
CN115146587B - Method, system, electronic equipment and storage medium for generating character library in handwriting - Google Patents

Method, system, electronic equipment and storage medium for generating character library in handwriting

Info

Publication number
CN115146587B
CN115146587B (granted publication of application CN202210752549.0A; earlier publication CN115146587A)
Authority
CN
China
Prior art keywords
style
samples
content
font
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210752549.0A
Other languages
Chinese (zh)
Other versions
CN115146587A (en)
Inventor
岳强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI YICHUANG INFORMATION TECHNOLOGY CO LTD
Beijing Hanyi Innovation Technology Co ltd
Original Assignee
SHANGHAI YICHUANG INFORMATION TECHNOLOGY CO LTD
Beijing Hanyi Innovation Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI YICHUANG INFORMATION TECHNOLOGY CO LTD, Beijing Hanyi Innovation Technology Co ltd filed Critical SHANGHAI YICHUANG INFORMATION TECHNOLOGY CO LTD
Priority to CN202210752549.0A
Publication of CN115146587A
Application granted
Publication of CN115146587B
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/10: Text processing
    • G06F 40/103: Formatting, i.e. changing of presentation of documents
    • G06F 40/109: Font handling; Temporal or kinetic typography
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/088: Non-supervised learning, e.g. competitive learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition
    • G06V 30/18: Extraction of features or characteristics of the image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition
    • G06V 30/24: Character recognition characterised by the processing or recognition method
    • G06V 30/242: Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V 30/244: Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
    • G06V 30/245: Font recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Character Discrimination (AREA)

Abstract

The disclosure relates to a method, a system, an electronic device, and a storage medium for generating a handwritten-character font library. The method comprises the following steps: acquiring a glyph picture; training on the glyph picture using a content encoder and a multi-input style encoder in a generative adversarial network, wherein the content encoder extracts the content feature vector from the glyph picture, and the multi-input style encoder receives a plurality of style samples of a designated style as input, computes the weight relation between the style samples and the target style to be generated so as to obtain a different style weight for each sample, and outputs the style feature vector of the samples from the style samples and their corresponding style weights; the content feature vector and the style feature vector are fed into a generator to generate Chinese character images. The method greatly shortens the font-library production cycle and the labor cost of glyph touch-up, makes personalized font-library generation simpler, more convenient, and of higher quality, and further promotes the personalized application of fonts.

Description

Method, system, electronic device and storage medium for generating a handwritten-character font library
Technical Field
The disclosure relates to the field of font-library generation, in particular to a method, a system, an electronic device, and a storage medium for generating a handwritten-character font library.
Background
With the development of the internet, the upgrading of portable smart devices, and the continuous innovation of internet companies, the services provided on the network can meet most basic living and entertainment needs, so the number of network users is very large; and the most basic element on the network, as a medium of information transmission, is text. Many people read text every day, and text conveys not only information but also personality and strength, culture and connotation, for example in corporate logos, personalized shop signs on the streets, and personalized signatures. Personalized font production is therefore becoming more and more important.
According to the requirements of GB2312, a font library contains at least 6763 simplified Chinese characters. The data are generally collected by having a calligrapher or font enthusiast write a large number of characters by hand while keeping the style uniform, so the writer's workload is huge and data collection is particularly time-consuming and labor-intensive, often taking weeks or even months. After data collection, a manuscript file is obtained; a preliminary font library is then produced using existing character segmentation and contour extraction; and finally a professional touches up every character to obtain a sellable font library file that meets the requirements.
With the development of artificial intelligence, some practitioners have attempted to simplify the above traditional process using AI techniques, but drawbacks remain. zi2zi (https://github.com/kaonashi-tyc/zi2zi) and Rewrite (Yuchen Tian. 2016. Rewrite: Neural Style Transfer For Chinese Fonts. Retrieved Nov 23, 2016 from https://github.com/kaonashi-tyc/Rewrite) are two representative works, whose main contribution is that a font of one style can be transformed into another style without training on one-to-one paired character data for the two styles. However, the target style can only be one of a fixed number of styles present in the training set, and characters generated in a new style suffer from problems such as missing strokes and an indistinct style. MXFont (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts, 2021) uses component information to enhance local style generation, i.e., the local detail quality of the generated characters. It first passes the input picture through several expert networks built from convolutional neural networks, then transforms the expert outputs through a fully connected network into a style feature vector and a content feature vector, concatenates the two, feeds them into a generator network built from convolutional neural networks, and outputs the final result. During training, the component information obtained by splitting Chinese characters is used as a supervision signal.
However, at the component level not all of the 6763 Chinese characters can be split, and the distribution of the split components is extremely unbalanced: some components appear with very high frequency while most others appear very rarely. As a result, only part of the generated output meets the standard for font-library production, and the rest does not. Other work (Generating Handwritten Chinese Characters using CycleGAN, 2018) uses adversarial training and densely connected structures, which can likewise be trained with only unpaired data, to transform an original font into a target font: the input glyph first passes through an encoder network to obtain a feature vector of the character, the feature vector is then converted into the feature vector of the target-style font through the densely connected structure, and the target-style glyph is finally produced by the generator network. This method adopts a one-to-one generation scheme: to generate a set of fonts in a new style, the model must be retrained, a cycle that often takes days or tens of days, and the generated glyphs are not clear enough to meet the requirements of font-library production.
In summary, the prior art has the following disadvantages:
1) The generated glyph pictures may have missing strokes or components.
2) Generating a set of fonts in a new style requires tens of days of training; the generation cycle is long.
3) The strokes and character structure of the generated content cannot be controlled.
Disclosure of Invention
The invention provides a method, a system, an electronic device, and a storage medium for generating a handwritten-character font library, which overcome the poor quality, blurred images, and long cycle of generating a set of new-style fonts with existing methods. The time to produce a handwritten font library is shortened from the several weeks of manual collection, or the tens of days of retraining required by other deep-neural-network methods, to 2-3 hours; the content and style of the generated characters can be specified; and the results meet the standard for font-library production. To solve the technical problems, the present disclosure provides the following technical solutions:
As an aspect of the embodiments of the present disclosure, there is provided a method for generating a handwritten-character font library, including the following steps:
Acquiring a glyph picture;
Training on the glyph picture using a content encoder and a multi-input style encoder in a generative adversarial network, wherein the content encoder extracts the content feature vector from the glyph picture, and the multi-input style encoder receives a plurality of style samples of a designated style as input, computes the weight relation between the style samples and the target style to be generated so as to obtain a different style weight for each sample, and outputs the style feature vector of the samples from the style samples and their corresponding style weights;
the content feature vector and the style feature vector are fed into a generator to generate a Chinese character image.
Optionally, the step of acquiring the glyph picture specifically includes: selecting a plurality of font files and rendering them into glyph pictures with black characters on a white background.
Optionally, before the step of acquiring the glyph picture, the method further includes training the generative adversarial network, wherein the loss function of the generative adversarial network comprises an adversarial loss, an L1 loss, and a content loss.
Optionally, a step of fine-tuning part of the parameters of the generative adversarial network is further included before the step of acquiring the glyph picture; the fine-tuning is achieved by adding a new consistency loss, L1_loss, expressed as follows:
L1_loss=||ContEnc(I_c)-ContEnc(I_f)||,
wherein ContEnc(I_c) is the content feature vector of the content glyph, and ContEnc(I_f) is the content feature vector of the generated glyph.
Optionally, the content encoder is composed of a plurality of convolutional layer-normalization layer-activation layer blocks.
Optionally, the multi-input style encoder is composed of an attention layer and a residual layer; the attention layer receives the input of a plurality of style samples, and the residual layer produces and outputs the style feature vector of the samples.
As another aspect of an embodiment of the present disclosure, there is provided a handwritten-character font library generating system, including:
a glyph picture acquisition module for acquiring glyph pictures;
an encoder module for training on the glyph picture using a content encoder and a multi-input style encoder in a generative adversarial network, wherein the content encoder extracts the content feature vector from the glyph picture, and the multi-input style encoder receives a plurality of style samples of a designated style as input, computes the weight relation between the style samples and the target style to be generated so as to obtain a different style weight for each sample, and outputs the style feature vector of the samples from the style samples and their corresponding style weights;
and a generator into which the content feature vector and the style feature vector are fed to generate a Chinese character image.
Optionally, the system further includes a fine-tuning module for fine-tuning the loss function during generative-adversarial-network training; the loss function adds a consistency loss, given by:
L1_loss=||ContEnc(I_c)-ContEnc(I_f)||,
wherein ContEnc(I_c) is the content feature vector of the content glyph, and ContEnc(I_f) is the content feature vector of the generated glyph.
As another aspect of the embodiments of the present disclosure, there is further provided an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the above method for generating a handwritten-character font library when executing the computer program.
As another aspect of the embodiments of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method for generating a handwritten-character font library.
With the embodiments of the present disclosure, a user can generate glyph pictures of any content in the corresponding style in a short time by writing only a small number of characters (some three hundred), shortening the time to produce the handwritten Chinese font library specified in GB2312 from the tens of days of existing methods to a few hours, with a better generation effect than other methods for quickly producing handwritten Chinese font libraries. The whole method does not depend on stroke components and does not require large amounts of manual touch-up of the generated results. The font-library production cycle and the labor cost of glyph touch-up are therefore greatly reduced, personalized font-library generation becomes simpler, more convenient, and of higher quality, and the personalized application of fonts is further promoted. The present disclosure also solves the missing-stroke problem of glyphs generated by existing methods by fine-tuning the generative adversarial network with a consistency loss.
Drawings
Fig. 1 is a flowchart of the method for generating a handwritten-character font library in embodiment 1;
Fig. 2 is a diagram of the training process of the generative adversarial network;
Fig. 3 is a diagram of the font generation effect;
Fig. 4 is a block diagram of the handwritten-character font library generating system;
Fig. 5(a), 5(b) and 5(c) are diagrams of examples of generated Chinese character images.
Detailed Description
The following description of the technical solutions in the embodiments of the present disclosure will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, not all embodiments. Based on the embodiments in this disclosure, all other embodiments that a person of ordinary skill in the art would obtain without making any inventive effort are within the scope of protection of this disclosure.
Example 1
As an aspect of the embodiments of the present disclosure, the present embodiment provides a method for generating a handwritten-character font library, as shown in Fig. 1, including the following steps:
S10, acquiring a font picture;
S20, training on the glyph picture using a content encoder and a multi-input style encoder in a generative adversarial network, wherein the content encoder extracts the content feature vector from the glyph picture, and the multi-input style encoder receives a plurality of style samples of a designated style as input, computes the weight relation between the style samples and the target style to be generated so as to obtain a different style weight for each sample, and outputs the style feature vector of the samples from the style samples and their corresponding style weights;
S30, the content feature vector and the style feature vector are fed into a generator to generate a Chinese character image.
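The data flow of steps S10 to S30 can be sketched with stub components. This is an illustrative assumption only: the function names and placeholder feature shapes are taken from the shapes quoted later in the text (a 256x256 input glyph, a 512x16x16 content feature, a 1x256 style feature), and the stubs stand in for the patent's actual neural networks.

```python
# Illustrative data-flow sketch for S10-S30; the encoders and generator
# are stubs that only track the feature shapes quoted in the text.

def content_encoder(glyph):                      # S20, content branch
    assert len(glyph) == 256 and len(glyph[0]) == 256
    return ("content_code", (512, 16, 16))       # placeholder content feature

def style_encoder(style_samples):                # S20, style branch
    assert len(style_samples) >= 1               # e.g. 5 samples of one style
    return ("style_code", (1, 256))              # placeholder style feature

def generator(content_code, style_code):         # S30, synthesis
    return {"content": content_code, "style": style_code,
            "image_size": (256, 256)}            # generated character image

glyph = [[0] * 256 for _ in range(256)]          # S10, a rendered glyph picture
samples = [glyph] * 5                            # several samples of one style
out = generator(content_encoder(glyph), style_encoder(samples))
```

The point of the sketch is the factored interface: content and style are encoded independently, so a new style only changes the style branch's input.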
With this configuration, a user can generate glyph pictures of any content in the corresponding style in a short time by writing only a small number of characters, shortening the time to produce the handwritten Chinese font library specified in GB2312 from the tens of days of existing methods to a few hours, with a better generation effect than other methods for quickly producing handwritten Chinese font libraries. The whole method does not depend on stroke components and does not require large amounts of manual touch-up of the generated results. The font-library production cycle and the labor cost of glyph touch-up are therefore greatly reduced, personalized font-library generation becomes simpler, more convenient, and of higher quality, and the personalized application of fonts is further promoted.
The steps of the embodiments of the present disclosure are described in detail below, respectively.
S10, acquiring a glyph picture. This step specifically includes: selecting a plurality of font files and rendering them into glyph pictures with black characters on a white background. The font files may be in ttf or otf format; the characters specified in GB2312 are then rendered from the font file into glyph picture files, for example as black characters on a white background.
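As a sanity check on the 6763-character figure cited throughout, the GB2312 hanzi can be enumerated directly from the encoding using only the Python standard library. This is a standalone illustration of the character set that gets rendered, not the patent's rendering code (which rasterizes ttf/otf files and is not shown here).

```python
# Enumerate the 6763 Chinese characters of GB2312 via Python's gb2312 codec.
# Level-1 hanzi occupy rows (qu) 16-55 and level-2 hanzi rows 56-87; each
# row has up to 94 cells (wei). Undefined cells fail to decode and are
# skipped (e.g. the last five cells of row 55 are unassigned).

def gb2312_hanzi():
    chars = []
    for row in range(16, 88):          # rows 16..87 hold the hanzi
        for cell in range(1, 95):      # cells 1..94 within a row
            try:
                chars.append(bytes([0xA0 + row, 0xA0 + cell]).decode("gb2312"))
            except UnicodeDecodeError:
                pass                   # unassigned code point
    return chars

hanzi = gb2312_hanzi()
```

Each of these characters would be rendered once per source font to build the training glyph pictures.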
S20, training on the glyph picture using a content encoder and a multi-input style encoder in a generative adversarial network, wherein the content encoder extracts the content feature vector from the glyph picture, and the multi-input style encoder receives a plurality of style samples of a designated style as input, computes the weight relation between the style samples and the target style to be generated so as to obtain a different style weight for each sample, and outputs the style feature vector of the samples from the style samples and their corresponding style weights.
As shown in Fig. 2, the generative adversarial network to be trained mainly comprises four neural networks built from convolutional layers: a content encoder ContEnc, a multi-input style encoder StyleEnc, a generator G, and a discriminator D.
In some embodiments, the content encoder is composed of a plurality of convolutional layer-normalization layer-activation layer blocks. For example, the content encoder ContEnc consists of five such blocks and maps a 256x256-pixel glyph picture to a 512x16x16 content feature vector that characterizes the content of the glyph image.
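The stated shapes can be checked with standard convolution output-size arithmetic. The kernel/stride/padding plan and the channel widths below are assumptions: one stride-1 block plus four stride-2 blocks is one plan that reaches the 16x16 spatial size, while the patent only fixes the block count and the input/output shapes.

```python
# Shape arithmetic for a 5-block conv-norm-activation encoder mapping a
# 256x256 glyph to a 512x16x16 feature map, under an assumed layer plan.

def conv2d_out(size, kernel=3, stride=1, padding=1):
    # Standard convolution output-size formula (dilation 1).
    return (size + 2 * padding - kernel) // stride + 1

# Assumed plan: a stride-1 stem, then four stride-2 downsampling blocks
# (256 -> 128 -> 64 -> 32 -> 16), widening channels up to 512.
channels = [1, 64, 128, 256, 512, 512]
strides = [1, 2, 2, 2, 2]

size = 256
for s in strides:
    size = conv2d_out(size, kernel=3, stride=s, padding=1)

content_shape = (channels[-1], size, size)   # expected (512, 16, 16)
```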
In some embodiments, the multi-input style encoder is composed of an attention layer and a residual layer; the attention layer receives the input of a plurality of style samples, and the residual layer produces and outputs the style feature vector of the samples. For example, the attention layer accepts a number of style samples, say 5, then computes the weight relation between the 5 style samples and the target style to be generated, obtaining a different weight for each of the 5 samples; finally, the residual layer combines the obtained style weights with the style samples and outputs a 1x256-dimensional style feature as the style representation of the 5 samples.
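The weighting step described here can be sketched as a softmax-normalized combination of per-sample style vectors. This is a hedged illustration: the patent does not give the attention formula, so the softmax normalization, the toy scores, and the helper names `softmax` and `fuse_styles` are assumptions; only the sample count (5) and the 256-dimensional output follow the text.

```python
# Sketch: attention scores over K style samples are normalized into
# weights, and the weighted sum of per-sample style vectors forms the
# 1x256 style representation.
import math

def softmax(scores):
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_styles(sample_vectors, attention_scores):
    weights = softmax(attention_scores)          # one weight per style sample
    dim = len(sample_vectors[0])                 # 256 in the text
    fused = [sum(w * v[i] for w, v in zip(weights, sample_vectors))
             for i in range(dim)]
    return fused, weights

samples = [[float(k)] * 256 for k in range(5)]   # 5 toy style vectors
fused, weights = fuse_styles(samples, [0.1, 0.9, 0.3, 0.2, 0.5])
```

The sample whose score is highest contributes most to the fused style, which is the intended effect of weighting samples by their relation to the target style.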
In some embodiments, the method further comprises a step of training the generative adversarial network before the glyph picture is acquired; the loss function of the generative adversarial network comprises an adversarial loss, an L1 loss, and a content loss. The adversarial loss, L1 loss, and content loss can be expressed by loss functions from the prior art, and the goal of training is to minimize the sum of the three losses.
In some embodiments, a step of fine-tuning part of the parameters of the generative adversarial network is further included before the glyph picture is acquired: the new-style font is rendered into glyph pictures, part of the parameters of the base model are fine-tuned, and the fine-tuned model then generates the new-style font. Fine-tuning can be completed in 2-3 hours. The fine-tuning is achieved by adding a new consistency loss, L1_loss, as follows:
L1_loss=||ContEnc(I_c)-ContEnc(I_f)||,
wherein ContEnc(I_c) is the content feature vector of the content glyph and ContEnc(I_f) is the content feature vector of the generated glyph; this ensures that the generated content stays consistent with the given content.
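A minimal sketch of this consistency loss, treating the ContEnc outputs as flat vectors (in the text they are 512x16x16 feature maps) and reading ||.|| as the L1 norm, which matches the name L1_loss; both simplifications are assumptions for illustration.

```python
# Consistency loss L1_loss = ||ContEnc(I_c) - ContEnc(I_f)||:
# L1 distance between the content features of the input content glyph
# and of the generated glyph. Zero when the generated glyph preserves
# the content exactly.

def l1_consistency_loss(feat_content, feat_generated):
    assert len(feat_content) == len(feat_generated)
    return sum(abs(c - f) for c, f in zip(feat_content, feat_generated))

feat_c = [0.2, -1.0, 0.5]      # stand-in for ContEnc(I_c)
feat_f = [0.2, -1.0, 0.5]      # identical features: content preserved
loss_same = l1_consistency_loss(feat_c, feat_f)
loss_diff = l1_consistency_loss(feat_c, [0.0, 0.0, 0.0])
```

Minimizing this term during fine-tuning penalizes the generator whenever the generated glyph's content features drift from those of the given content glyph, which is how the missing-stroke problem is addressed.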
Fig. 3 shows the generation effect of fonts in a font library produced by the generator.
The embodiment of the disclosure can take a new style as input and, by varying the given content glyph, generate the 6763 characters specified by GB2312. A font library for the roughly 20,000-character set specified by GB18030 may also be generated.
Example 2
As another aspect of the embodiments of the present disclosure, this embodiment provides a handwritten-character font library generating system 100, as shown in Fig. 4, including:
a glyph picture acquisition module 1 for acquiring glyph pictures: it selects a plurality of font files, for example font files covering the GB2312 character set, and renders them into glyph pictures with black characters on a white background. The font files may be in ttf or otf format; the characters specified in GB2312 are then rendered from the font file into glyph picture files, for example as black characters on a white background.
an encoder module 2 for training on the glyph picture using a content encoder and a multi-input style encoder in a generative adversarial network, wherein the content encoder extracts the content feature vector from the glyph picture, and the multi-input style encoder receives a plurality of style samples of a designated style as input, computes the weight relation between the style samples and the target style to be generated so as to obtain a different style weight for each sample, and outputs the style feature vector of the samples from the style samples and their corresponding style weights;
and a generator 3 that receives the content feature vector and the style feature vector and generates a Chinese character image.
For example, Fig. 5 shows an example of handwritten font library generation: Fig. 5(a) is the specified content input, i.e., the content input as glyph pictures; Fig. 5(b) shows a plurality of style samples, i.e., the designated style; Fig. 5(c) is the generated Chinese character image.
As shown in Fig. 2, the generative adversarial network to be trained mainly comprises four neural networks built from convolutional layers: a content encoder ContEnc, a multi-input style encoder StyleEnc, a generator G, and a discriminator D.
In some embodiments, the content encoder is composed of a plurality of convolutional layer-normalization layer-activation layer blocks. For example, the content encoder ContEnc consists of five such blocks and maps a 256x256-pixel glyph picture to a 512x16x16 content feature vector that characterizes the content of the glyph image.
In some embodiments, the multi-input style encoder is composed of an attention layer and a residual layer; the attention layer receives the input of a plurality of style samples, and the residual layer produces and outputs the style feature vector of the samples. For example, the attention layer accepts a number of style samples, say 5, then computes the weight relation between the 5 style samples and the target style to be generated, obtaining a different weight for each of the 5 samples; finally, the residual layer combines the obtained style weights with the style samples and outputs a 1x256-dimensional style feature as the style representation of the 5 samples.
In some embodiments, the method further comprises a step of training the generative adversarial network before the glyph picture is acquired; the loss function of the generative adversarial network comprises an adversarial loss, an L1 loss, and a content loss. The adversarial loss, L1 loss, and content loss can be expressed by loss functions from the prior art, and the goal of training is to minimize the sum of the three losses.
In some embodiments, the system further comprises a fine-tuning module for fine-tuning the loss function during generative-adversarial-network training; the loss function adds a consistency loss, given by:
L1_loss=||ContEnc(I_c)-ContEnc(I_f)||,
wherein ContEnc(I_c) is the content feature vector of the content glyph, and ContEnc(I_f) is the content feature vector of the generated glyph.
In some embodiments, the discriminator D adopts the same structure as VGG16 (Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)); it takes a generated image or a real image as input and outputs, respectively, the probability that the input is a generated image and the probability that it is a real image.
The embodiment of the disclosure can take a new style as input and, by varying the given content glyph, generate the 6763 characters specified by GB2312. A font library for the roughly 20,000-character set specified by GB18030 may also be generated.
Example 3
An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method for generating a handwritten-character font library of embodiment 1 when executing the computer program.
Embodiment 3 of the present disclosure is merely an example, and should not be construed as limiting the functionality and scope of use of the embodiments of the present disclosure.
The electronic device may be in the form of a general purpose computing device, which may be a server device, for example. Components of an electronic device may include, but are not limited to: at least one processor, at least one memory, a bus connecting different system components, including the memory and the processor.
The buses include a data bus, an address bus, and a control bus.
The memory may include volatile memory such as Random Access Memory (RAM) and/or cache memory, and may further include Read Only Memory (ROM).
The memory may also include program means having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The processor executes various functional applications and data processing by running computer programs stored in the memory.
The electronic device may also communicate with one or more external devices (e.g., a keyboard, a pointing device, etc.). Such communication may take place through an input/output (I/O) interface. The electronic device may also communicate with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet, through a network adapter. The network adapter communicates with the other modules of the electronic device via the bus. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with the electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that although several units/modules or sub-units/modules of an electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present application. Conversely, the features and functions of one unit/module described above may be further divided into ones that are embodied by a plurality of units/modules.
Example 4
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for generating a handwritten-character font library in embodiment 1.
More specifically, the readable storage medium may be, for example and without limitation: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible embodiment, the present disclosure may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps of the method for generating a character library in handwriting described in embodiment 1.
The program code for carrying out the present disclosure may be written in any combination of one or more programming languages, and may execute entirely on the user's device, partly on the user's device as a stand-alone software package, partly on the user's device and partly on a remote device, or entirely on the remote device.
Although embodiments of the present disclosure have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the disclosure, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. A method for generating a character library in handwriting is characterized by comprising the following steps:
Acquiring a font picture;
training on the font picture by using a content encoder and a style multi-input encoder in a generative adversarial network, wherein the content encoder is used for extracting content feature vectors from the font picture, and the style multi-input encoder receives as input a plurality of style samples of a designated style, obtains the weight relations between the plurality of style samples and the target style to be generated so as to obtain the different style weights corresponding to the plurality of style samples, and outputs the style feature vectors of the plurality of samples according to the plurality of style samples and their corresponding style weights;
Feeding the content feature vector and the style feature vector into a generator to generate a Chinese character image;
the content encoder is composed of a plurality of convolution layer-normalization layer-activation layer structures;
The method further comprises, before the step of acquiring the font picture, a step of training the generative adversarial network, wherein the loss function of the generative adversarial network comprises an adversarial loss, an L1 loss, and a content loss;
The method further comprises, before the step of acquiring the font picture, a step of fine-tuning part of the parameters of the generative adversarial network, wherein the fine-tuning is realized by adding a new consistency loss, and the consistency loss L1_loss is expressed as follows:
L1_loss=||ContEnc(I_c)-ContEnc(I_f)||,
wherein ContEnc(I_c) is the content feature vector of the content glyph, and ContEnc(I_f) is the content feature vector of the generated glyph.
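As an illustrative sketch only (not the patented implementation), the consistency loss above is the L1 distance between the content features of the content glyph and of the generated glyph. The `content_encoder` below is a hypothetical stand-in (a fixed random linear map) for the trained convolution-normalization-activation encoder ContEnc:

```python
import numpy as np

def consistency_loss(feat_content, feat_generated):
    """L1_loss = ||ContEnc(I_c) - ContEnc(I_f)||: L1 distance between
    the content features of the content glyph and the generated glyph."""
    return np.abs(feat_content - feat_generated).sum()

# Hypothetical stand-in for the trained content encoder ContEnc:
# a fixed linear map from a flattened glyph image to a feature vector.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 256))           # 64-dim features, 16x16 glyphs

def content_encoder(glyph):                  # glyph: (16, 16) array
    return W @ glyph.reshape(-1)

I_c = rng.random((16, 16))                   # content (source) glyph
I_f = I_c + 0.01 * rng.random((16, 16))      # generated glyph, close to I_c

loss = consistency_loss(content_encoder(I_c), content_encoder(I_f))
print(loss >= 0.0)  # True
```

During fine-tuning, minimizing this term pulls the generated glyph's content features toward those of the source glyph, which helps preserve character identity while the style changes.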
2. The method for generating a character library in handwriting as claimed in claim 1, wherein the step of acquiring a font picture comprises: selecting a plurality of font files, and rendering the font files into a plurality of font pictures with a white background and black characters.
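A minimal sketch of the rendering step in claim 2, assuming the Pillow imaging library. A real pipeline would load each selected font file with `ImageFont.truetype(path, size)`; this sketch falls back to Pillow's built-in bitmap font so it runs without any font file:

```python
from PIL import Image, ImageDraw, ImageFont

def render_glyph(char, size=256, font=None):
    """Render one character as a white-background, black-character picture."""
    img = Image.new("L", (size, size), color=255)        # white background
    draw = ImageDraw.Draw(img)
    font = font or ImageFont.load_default()              # stand-in for a .ttf font file
    draw.text((size // 4, size // 4), char, fill=0, font=font)  # black character
    return img

pic = render_glyph("A")
print(pic.size)  # (256, 256)
```

Rendering every character of a font file this way yields the white-background, black-character font pictures used as training input.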
3. The method of claim 1, wherein the style multi-input encoder is composed of an attention layer and a residual layer, the attention layer being used for receiving the input of the plurality of style samples, and the residual layer being used for obtaining and outputting the style feature vectors of the plurality of samples.
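A schematic numpy sketch of the weighting idea behind claims 1 and 3 (a simplification under assumed embeddings, not the patented network): each style-sample embedding is scored against a target-style query, a softmax turns the scores into style weights, and the weighted combination gives the aggregated style feature vector:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_style(style_embeddings, target_query):
    """style_embeddings: (k, d) embeddings of k style samples;
    target_query: (d,) embedding representing the target style.
    Returns (per-sample style weights, aggregated style vector)."""
    scores = style_embeddings @ target_query      # similarity of each sample to the target
    weights = softmax(scores)                     # different style weights per sample
    style_vector = weights @ style_embeddings     # weighted combination of the samples
    return weights, style_vector

rng = np.random.default_rng(1)
samples = rng.standard_normal((5, 8))             # 5 style samples, 8-dim embeddings
query = rng.standard_normal(8)
w, v = aggregate_style(samples, query)
print(np.isclose(w.sum(), 1.0), v.shape)  # True (8,)
```

Samples more similar to the target style receive larger weights, so the aggregated style vector is dominated by the most relevant style samples.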
4. A system for generating a character library in handwriting, characterized by comprising:
a font picture acquisition module, which acquires font pictures;
an encoder module, which trains on the font picture by using a content encoder and a style multi-input encoder in a generative adversarial network, wherein the content encoder is used for extracting content feature vectors from the font picture, and the style multi-input encoder receives as input a plurality of style samples of a designated style, obtains the weight relations between the plurality of style samples and the target style to be generated so as to obtain the different style weights corresponding to the plurality of style samples, and outputs the style feature vectors of the plurality of samples according to the plurality of style samples and their corresponding style weights;
a generator, into which the content feature vector and the style feature vector are fed to generate a Chinese character image;
the content encoder is composed of a plurality of convolution layer-normalization layer-activation layer structures;
a fine-tuning module, which performs fine-tuning with the loss function in the process of training the generative adversarial network, wherein the loss function further comprises a consistency loss, the formula of which is as follows:
L1_loss=||ContEnc(I_c)-ContEnc(I_f)||,
wherein ContEnc(I_c) is the content feature vector of the content glyph, and ContEnc(I_f) is the content feature vector of the generated glyph.
5. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method for generating a character library in handwriting as claimed in any one of claims 1 to 3 when executing the computer program.
6. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, realizes the steps of the method for generating a character library in handwriting according to any one of claims 1 to 3.
CN202210752549.0A 2022-06-28 2022-06-28 Method, system, electronic equipment and storage medium for generating character library in handwriting Active CN115146587B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210752549.0A CN115146587B (en) 2022-06-28 2022-06-28 Method, system, electronic equipment and storage medium for generating character library in handwriting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210752549.0A CN115146587B (en) 2022-06-28 2022-06-28 Method, system, electronic equipment and storage medium for generating character library in handwriting

Publications (2)

Publication Number Publication Date
CN115146587A CN115146587A (en) 2022-10-04
CN115146587B true CN115146587B (en) 2024-10-15

Family

ID=83409855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210752549.0A Active CN115146587B (en) 2022-06-28 2022-06-28 Method, system, electronic equipment and storage medium for generating character library in handwriting

Country Status (1)

Country Link
CN (1) CN115146587B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095038A (en) * 2021-05-08 2021-07-09 杭州王道控股有限公司 Font generation method and device for generating countermeasure network based on multitask discriminator
CN113792854A (en) * 2021-09-09 2021-12-14 北京百度网讯科技有限公司 Model training and word stock establishing method, device, equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107644006B (en) * 2017-09-29 2020-04-03 北京大学 Automatic generation method of handwritten Chinese character library based on deep neural network
US11250252B2 (en) * 2019-12-03 2022-02-15 Adobe Inc. Simulated handwriting image generator
CN111815509B (en) * 2020-09-02 2021-01-01 北京邮电大学 Image style conversion and model training method and device
CN113393370A (en) * 2021-06-02 2021-09-14 西北大学 Method, system and intelligent terminal for migrating Chinese calligraphy character and image styles
CN113837366A (en) * 2021-09-23 2021-12-24 中国计量大学 Multi-style font generation method
CN114139495B (en) * 2021-11-29 2024-10-22 合肥高维数据技术有限公司 Chinese font style migration method based on self-adaptive generation countermeasure network
CN114495118B (en) * 2022-04-15 2022-08-09 华南理工大学 Personalized handwritten character generation method based on countermeasure decoupling

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095038A (en) * 2021-05-08 2021-07-09 杭州王道控股有限公司 Font generation method and device for generating countermeasure network based on multitask discriminator
CN113792854A (en) * 2021-09-09 2021-12-14 北京百度网讯科技有限公司 Model training and word stock establishing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN115146587A (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN111027563A (en) Text detection method, device and recognition system
CN111738169B (en) Handwriting formula recognition method based on end-to-end network model
CN110569359B (en) Training and application method and device of recognition model, computing equipment and storage medium
CN110570481A (en) calligraphy word stock automatic repairing method and system based on style migration
CN114596566B (en) Text recognition method and related device
CN110781672B (en) Question bank production method and system based on machine intelligence
US11599727B2 (en) Intelligent text cleaning method and apparatus, and computer-readable storage medium
CN110991279B (en) Document Image Analysis and Recognition Method and System
CN110851644A (en) Image retrieval method and device, computer-readable storage medium and electronic device
CN114612921B (en) Form recognition method and device, electronic equipment and computer readable medium
CN117746186A (en) Training method of low-rank adaptive model, text image generation method and system
CN116029273A (en) Text processing method, device, computer equipment and storage medium
CN112101364A (en) Semantic segmentation method based on parameter importance incremental learning
Vafaie et al. Handwritten and printed text identification in historical archival documents
Davoudi et al. Ancient document layout analysis: Autoencoders meet sparse coding
DE102022131824A1 (en) Visual speech recognition for digital videos using generative-adversative learning
CN115690276A (en) Video generation method and device of virtual image, computer equipment and storage medium
CN114330514A (en) Data reconstruction method and system based on depth features and gradient information
CN115146587B (en) Method, system, electronic equipment and storage medium for generating character library in handwriting
CN112307749A (en) Text error detection method and device, computer equipment and storage medium
CN111881900A (en) Corpus generation, translation model training and translation method, apparatus, device and medium
US20240153259A1 (en) Single image concept encoder for personalization using a pretrained diffusion model
US20240104951A1 (en) Image and semantic based table recognition
CN114399782B (en) Text image processing method, apparatus, device, storage medium, and program product
CN115690816A (en) Text element extraction method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant