
CN115049556A - StyleGAN-based face image restoration method - Google Patents

StyleGAN-based face image restoration method

Info

Publication number
CN115049556A
Authority
CN
China
Prior art keywords
image
face
latent code vector
latent code
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210736142.9A
Other languages
Chinese (zh)
Inventor
陈鹏
刘亚特
郑春厚
章军
夏懿
梁栋
黄林生
王兵
王刘向
章瑜真
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202210736142.9A
Publication of CN115049556A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a StyleGAN-based face image restoration method comprising the following steps: segmenting a real face image into a face region and a background region to serve as a training set; applying data augmentation to the data set and setting the original image as the label; training an encoder with the training set and labels to obtain an encoder network; using the encoder network to extract the latent code vector of a real face image, the latent code vector of the face region of the image to be restored, and the latent code feature map of the background region of the image to be restored; and mixing the latent code vector of the real face image with the latent code vector of the face region of the image to be restored to obtain a mixed face latent code vector, which is fed together with the background latent code feature map into the StyleGAN generator network to produce the restored face image. The method substantially improves face image restoration capability while preserving structural similarity throughout the restoration process.

Description

StyleGAN-based face image restoration method
Technical Field
The application relates to the field of computer vision, and in particular to a StyleGAN-based face image restoration method.
Background
In recent years, the quality of images produced by generative adversarial networks (GANs) has improved markedly; in particular, neural networks can now randomly generate high-quality face images. The state-of-the-art GAN StyleGAN achieves the best visual quality on high-resolution images. StyleGAN also possesses a latent space W in which attributes can be disentangled; sampling randomly in W yields randomly generated face images. Embedding a real image into W, i.e. obtaining the latent code vector of the real image and feeding it into the StyleGAN generator network, produces a reconstruction of that image. Existing research has found that embedding the real image into the extended W+ space yields a finer reconstruction. Two methods are mainly used to embed a real image into W+: the first continuously optimizes the latent code vector to obtain the best reconstructed image; the second obtains the latent code vector in a single forward pass through an encoder, and thereby obtains the reconstruction directly. Because the StyleGAN generator model contains rich face image information, image restoration can be completed using the face priors inside the generator. Moreover, StyleGAN controls its generated content through the latent code vector: feeding the latent code vector into different layers of the StyleGAN generator network controls the generated result at different scales.
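By way of illustration, a minimal sketch of the first, optimization-based embedding route is given below. It assumes a pretrained generator G that maps an 18 × 512 W+ latent to an image and a target image tensor; the function name and settings are illustrative, not the patent's API.

```python
# Minimal sketch of optimization-based W+ embedding (assumed interfaces).
import torch
import torch.nn.functional as F

def invert_by_optimization(G, target, steps=500, lr=0.01):
    # Start from a zero latent; 1 x 18 x 512 matches the W+ layout used here.
    w_plus = torch.zeros(1, 18, 512, requires_grad=True)
    opt = torch.optim.Adam([w_plus], lr=lr)
    for _ in range(steps):
        recon = G(w_plus)                    # reconstruct an image from the latent
        loss = F.mse_loss(recon, target)     # pixel-level reconstruction loss
        opt.zero_grad()
        loss.backward()                      # gradients flow into w_plus only
        opt.step()
    return w_plus.detach()
```

The encoder route, which this application adopts, replaces the whole loop with a single forward pass.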
Current face image restoration techniques usually rely on preset algorithms: the reconstructed result differs substantially from the original image, structural similarity is not well preserved during restoration, the result lacks the texture and luster of real skin, the overall effect is unsatisfactory, and restoration work is inconvenient. Traditional restoration methods depend on the boundary information and texture features of the image to be restored; since they are generally built on purely mathematical principles, their ability to generate new information is weak, and their robustness and generality are poor. In summary, face image restoration methods still have considerable room for improvement.
Disclosure of Invention
The StyleGAN-based face image restoration method provided here solves the technical problems of the prior art, namely the large difference between the reconstructed result and the original image and the failure to preserve structural similarity during restoration; it greatly improves face image restoration capability and preserves structural similarity well throughout the restoration process.
An embodiment of the application provides a StyleGAN-based face image restoration method comprising the following steps: segmenting a real face image into a face region and a background region to serve as a training set; applying horizontal-flip data augmentation to the data set and setting the original image as the label; training an encoder with the training set and labels to obtain an encoder network; using the encoder network to extract the latent code vector of a real face image, the latent code vector of the face region of the image to be restored, and the latent code feature map of the background region of the image to be restored; and mixing the latent code vector of the real face image with the latent code vector of the face region of the image to be restored to obtain a mixed face latent code vector, which is fed together with the background latent code feature map into the StyleGAN generator network to produce the restored face image.
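As a hedged end-to-end sketch of this pipeline, the fragment below strings the steps together. All components (segment, face_encoder, bg_encoder, generator) are assumed to be already trained; the names and signatures are illustrative rather than taken from the patent.

```python
# End-to-end restoration sketch under the stated assumptions.
import torch

def restore(damaged_img, real_img, segment, face_encoder, bg_encoder, generator):
    face, background = segment(damaged_img)        # face/background split
    w_damaged = face_encoder(face)                 # 18 x 512 latent, damaged face
    w_real = face_encoder(segment(real_img)[0])    # 18 x 512 latent, real face
    bg_feat = bg_encoder(background)               # background latent feature map
    # mix 8:10 - coarse layers from the damaged face, fine layers from the real face
    w_mixed = torch.cat([w_damaged[:, :8], w_real[:, 8:]], dim=1)
    return generator(w_mixed, bg_feat)             # restored face image
```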
Further, training the encoder with the training set and the labels comprises the following steps: image encoding, in which the face region and the background region are encoded separately — for the face region, an encoder combining ResNet50 with SE attention modules encodes the input face-region image to obtain the latent code vector of the face part, while for the background region, a convolutional neural network extracts background features to obtain the latent code feature map of the background part; image reconstruction, in which the face latent code vector and the background latent feature map are fed into a StyleGAN2 generator to obtain a reconstructed image; and encoder optimization, in which the pixel-wise L2 distance, the perceptual similarity score, and the L2 distance between face identity features are computed from the label image and the reconstructed image, and the encoder network is optimized accordingly to obtain the trained encoder network.
Further, the encoder structure combining ResNet50 with SE attention modules is used to extract the latent code vector of the face-region image.
Further, the face latent code vector has dimension 18 × 512, and the background latent code feature map has dimension 512 × 64.
Further, the encoder is optimized with three loss functions: the first computes the L2 distance between the image label and the generated image over pixel values; the second extracts deep feature information from the image label and the generated image with a VGG16 neural network and computes the L2 distance between the two sets of deep features; the third extracts face feature information from the image label and the generated image with a face recognition neural network and computes the L2 distance between the two sets of face features.
Further, the latent code vector of the real face image and the latent code vector of the face region of the image to be restored are mixed in the ratio 8:10.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
1. due to the adoption of the encoder method, the reconstruction work of the damaged image can be completed by one-time forward transmission, and the speed is high; meanwhile, the repairing method utilizes rich human face prior knowledge in StyleGAN, so that the repairing details of the five sense organs are more accurate and real.
2. Because the accurate repair of the damaged face image is realized through the pre-trained model, the true skin texture and luster can be given to the image.
Drawings
Fig. 1 is a flowchart of the StyleGAN-based face image restoration method in an embodiment of the present application;
Fig. 2 is a flowchart of encoder training in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of the face image restoration method in an embodiment of the present application.
Detailed Description
The embodiment of the application discloses a StyleGAN-based face image restoration method that solves the technical problems of the prior art: the large difference between the reconstructed result and the original image, and the failure to preserve structural similarity during restoration.
In view of the above technical problems, the general idea of the technical solution provided by the present application is as follows: segment a real face image into a face region and a background region to serve as a training set; apply horizontal-flip data augmentation to the data set and set the original image as the label; train an encoder with the training set and labels to obtain an encoder network; use the encoder network to extract the latent code vector of a real face image, the latent code vector of the face region of the image to be restored, and the latent code feature map of the background region of the image to be restored; and mix the latent code vector of the real face image with the latent code vector of the face region of the image to be restored to obtain a mixed face latent code vector, which is fed together with the background latent code feature map into the StyleGAN generator network to produce the restored face image.
To make the above method of the embodiments more comprehensible, specific embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 shows the StyleGAN-based face image restoration method in an embodiment of the present application, described in detail below step by step.
S1, segment the real face image into a face region and a background region, which together serve as the training set.
In a specific implementation, the real face image can be segmented into a face region and a background region by a semantic segmentation network.
In a specific implementation, the photos in the real-face-image data set are portrait photos of real-world people; the data set is obtained after collection.
In a specific implementation, the missing background portion of the face-region image is filled with RGB (0, 0, 0), and the missing face portion of the background-region image is likewise filled with RGB (0, 0, 0).
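A small sketch of this filling step, assuming the segmentation network yields a binary face mask; the function name and array shapes are illustrative.

```python
# Split an image into face/background branches, filling the complement with
# RGB (0, 0, 0). `img` is H x W x 3, `face_mask` is an H x W boolean mask.
import numpy as np

def split_regions(img: np.ndarray, face_mask: np.ndarray):
    face_img = img.copy()
    face_img[~face_mask] = 0    # missing background filled with RGB (0, 0, 0)
    bg_img = img.copy()
    bg_img[face_mask] = 0       # missing face filled with RGB (0, 0, 0)
    return face_img, bg_img
```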
In a specific implementation, StyleGAN is trained on a large number of real face images, yielding a StyleGAN generator model that stably generates diverse face images.
S2, apply horizontal-flip data augmentation to the data set, and set the original image as the label.
In a specific implementation, the unsegmented original image may be taken as the label image.
S3, train the encoder with the training set and the labels to obtain the encoder network.
In a specific implementation, as shown in fig. 2, training may be performed as follows:
S31, encode the image: the face region and the background region are encoded separately. For the face region, an encoder combining ResNet50 with SE attention modules encodes the input face-region image to obtain the latent code vector of the face part. For the background region, a convolutional neural network extracts background features to obtain the latent code feature map of the background part.
In a specific implementation, the encoder network that processes the face region can use a structure combining ResNet50 with SE attention modules. It has 23 convolution blocks in total; each block comprises a BatchNorm layer, a two-dimensional convolution layer, a LeakyReLU activation function, and an SE attention module, and the block input, after max pooling, is added to the output of the SE module. This skip-connection structure improves information flow and effectively avoids the vanishing-gradient problem caused by an overly deep network.
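A sketch of one such convolution block in PyTorch is given below. The channel sizes, stride, SE reduction ratio, and the 1 × 1 convolution used to match channels on the skip path are illustrative assumptions; only the block layout (BatchNorm, 2-D convolution, LeakyReLU, SE attention, max-pooled skip connection) follows the text.

```python
import torch
import torch.nn as nn

class SEModule(nn.Module):
    """Squeeze-and-excitation: reweight channels by global context."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = x.mean(dim=(2, 3))              # squeeze: global average pooling
        w = self.fc(w)[:, :, None, None]    # excite: per-channel weights
        return x * w

class ConvBlock(nn.Module):
    """BatchNorm -> conv -> LeakyReLU -> SE, plus a max-pooled skip path."""
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch),
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.LeakyReLU(0.2),
            SEModule(out_ch))
        self.skip = nn.Sequential(          # max-pooled input joins the SE output
            nn.MaxPool2d(stride),
            nn.Conv2d(in_ch, out_ch, 1))    # 1x1 conv to match channels (assumed)

    def forward(self, x):
        return self.body(x) + self.skip(x)
```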
The feature map f1 output by the 6th convolution block, the feature map f2 output by the 20th convolution block, and the feature map f3 output by the 23rd convolution block may be extracted and fused by upsampling and addition into feature maps c1, c2, and c3, where c1 = f3, c2 = upsample(c1) + f2, and c3 = upsample(c2) + f1. Shallow features contain more detailed information, while deep features attend to the whole image rather than to detail; this feature-pyramid structure fuses deep and shallow features, so the global features and semantic information of the image are kept while detail is still attended to.
In a specific implementation, the network module that converts feature maps into latent code vectors consists of a two-dimensional convolution, a LeakyReLU activation function, and a fully connected layer, and processes c1, c2, and c3 separately: feature map c1 is converted into a 3 × 512 latent code vector, c2 into a 4 × 512 latent code vector, and c3 into an 11 × 512 latent code vector. The resulting latent code vectors are concatenated to obtain the final 18 × 512 latent code vector.
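The fusion and latent-mapping steps can be sketched as follows. The channel counts, pooling choice, and internal layout of the feature-to-style module are illustrative assumptions; the fusion formulas c1 = f3, c2 = upsample(c1) + f2, c3 = upsample(c2) + f1 and the 3 + 4 + 11 = 18 style split follow the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_pyramid(f1, f2, f3):
    # f1: shallow, f2: middle, f3: deep feature map (same channel count assumed)
    c1 = f3
    c2 = F.interpolate(c1, size=f2.shape[-2:], mode='bilinear',
                       align_corners=False) + f2
    c3 = F.interpolate(c2, size=f1.shape[-2:], mode='bilinear',
                       align_corners=False) + f1
    return c1, c2, c3

class Map2Style(nn.Module):
    """Convert one fused feature map into n_styles 512-d latent vectors."""
    def __init__(self, in_ch, n_styles):
        super().__init__()
        self.n_styles = n_styles
        self.conv = nn.Conv2d(in_ch, n_styles * 512, 3, padding=1)
        self.act = nn.LeakyReLU(0.2)
        self.fc = nn.Linear(512, 512)

    def forward(self, x):
        x = self.act(self.conv(x)).mean(dim=(2, 3))   # pool away spatial dims
        x = x.view(-1, self.n_styles, 512)
        return self.fc(x)                             # (batch, n_styles, 512)

# c1 -> 3 styles, c2 -> 4 styles, c3 -> 11 styles; concatenation gives 18 x 512:
# w = torch.cat([m1(c1), m2(c2), m3(c3)], dim=1)
```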
In a specific implementation, the encoder network that processes the background-region image uses the same kind of convolution block as the face encoder; because the background is processed into a latent code feature map, only 6 convolution blocks are used. Each block comprises a BatchNorm layer, a two-dimensional convolution layer, a ReLU activation function, and an SE attention module, and the block input, after max pooling, is added to the output of the SE module. The background encoder network processes the background-region image into a 512 × 64 latent code feature map.
S32, reconstruct the image: feed the face latent code vector and the background latent feature map into the StyleGAN2 generator to obtain the reconstructed image.
In a specific implementation, the output of the encoder network is connected to the input of the StyleGAN network: the output of the face-image encoder is connected to the input of the StyleGAN generator, and the output of the background-image encoder is fused with an intermediate-layer feature map of the StyleGAN generator. The 18 × 512 latent code vector output by the face-image encoder is fed into different layers of the StyleGAN generator, controlling the face generation effect at different scales. The 512 × 64 feature map output by the background-image encoder is fused, with weighting, into an intermediate-layer feature map of the generator; by suppressing and enhancing certain regions of that intermediate feature map, accurate reconstruction of the background is achieved.
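A minimal sketch of this weighted fusion, assuming the background feature map and the generator's intermediate feature map share a channel count; the blend weight alpha and the resize step are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fuse_background(gen_feat, bg_feat, alpha=0.5):
    # Resize the background feature map to the generator layer's spatial size.
    bg = F.interpolate(bg_feat, size=gen_feat.shape[-2:],
                       mode='bilinear', align_corners=False)
    # Per-element blend: regions of gen_feat are suppressed or enhanced by bg.
    return (1 - alpha) * gen_feat + alpha * bg
```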
In a specific implementation, when the encoder is trained, the weights of the StyleGAN generator network are fixed, and the encoder is optimized by computing the loss between the image generated by the StyleGAN generator and the preset label image.
The encoder is optimized by measuring the similarity between the generated image and the label image and using that similarity to compute the loss. The overall loss function L consists of three terms. The first is the mean squared error L_mse between the image label and the generated image, computed over pixel values. The second, L_lpips, extracts deep feature information from the image label and the generated image with a VGG16 neural network and computes the mean squared error between the two sets of deep features. The third, L_id, extracts face identity features from the image label and the generated image with a face recognition neural network and computes the mean squared error between the two sets of face features:

L_mse = ‖I − G(E(I))‖_2
L_lpips = ‖LPIPS(I) − LPIPS(G(E(I)))‖_2
L_id = ‖ID(I) − ID(G(E(I)))‖_2

where I is the input image, E is the encoder network being trained, and G is the pretrained StyleGAN generator network. LPIPS is a pretrained VGG16 network used to extract deep image features and compute the perceptual similarity of two images. ID is a pretrained face recognition network used to extract the identity features of the face in an image.

The total loss function is

L_total = λ_mse · L_mse + λ_lpips · L_lpips + λ_id · L_id

where L_mse, the mean squared error between the pixel values of the two images, is weighted by λ_mse = 1.0; L_lpips, the mean squared error between the deep features of the two images, is weighted by λ_lpips = 0.8; and L_id, the mean squared error between the face features of the two images, is weighted by λ_id = 0.5.
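A sketch of this combined loss in PyTorch, with lpips_net and id_net standing in for the pretrained VGG16 and face recognition feature extractors; the function names are illustrative.

```python
import torch
import torch.nn.functional as F

def encoder_loss(label_img, gen_img, lpips_net, id_net,
                 w_mse=1.0, w_lpips=0.8, w_id=0.5):
    l_mse = F.mse_loss(gen_img, label_img)                          # pixels
    l_lpips = F.mse_loss(lpips_net(gen_img), lpips_net(label_img))  # deep features
    l_id = F.mse_loss(id_net(gen_img), id_net(label_img))           # identity
    return w_mse * l_mse + w_lpips * l_lpips + w_id * l_id
```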
S33, optimize the encoder: compute the pixel-wise L2 distance, the perceptual similarity score, and the L2 distance between face identity features from the label image and the reconstructed image, and optimize the encoder network to obtain the trained encoder network.
In a specific implementation, the batch size can be set to 8, the number of iterations to 300,000, and the learning rate to 1e-4. With a batch size of 8, eight samples are drawn from the real face images each time; a semantic segmentation algorithm produces the eight face images and background images, which are fed into the face encoder network and the background encoder network respectively to obtain the corresponding latent code vectors and latent code feature maps; these are fed into the StyleGAN generator to obtain the generated images, completing the forward pass. The loss is then computed with the loss functions and weights defined above, and the face encoder network and background encoder network are optimized by backpropagation.
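A hedged sketch of one training run under these settings. loader, segment_batch, face_encoder, bg_encoder, G, lpips_net, id_net, and encoder_loss are assumed to exist as in the sketches above; the optimizer choice is an assumption.

```python
import torch

for p in G.parameters():
    p.requires_grad_(False)                      # generator weights stay fixed

opt = torch.optim.Adam(list(face_encoder.parameters()) +
                       list(bg_encoder.parameters()), lr=1e-4)

for step in range(300_000):
    imgs = next(loader)                          # batch of 8 real face images
    face, bg = segment_batch(imgs)               # semantic segmentation split
    recon = G(face_encoder(face), bg_encoder(bg))  # forward through frozen G
    loss = encoder_loss(imgs, recon, lpips_net, id_net)
    opt.zero_grad()
    loss.backward()                              # backprop into the encoders only
    opt.step()
```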
S4, use the encoder network to extract the latent code vector of the real face image, the latent code vector of the face region of the image to be restored, and the latent code feature map of the background region of the image to be restored.
In a specific implementation, as shown in fig. 3, the face recognition library Dlib may be used to locate facial key points in the image to be restored and crop it to obtain the face image to be restored; a semantic segmentation algorithm then divides the face image to be restored into a face-region image and a background-region image.
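A sketch of this preprocessing with Dlib; the landmark-model filename and the crop margin are illustrative assumptions.

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def crop_face(img: np.ndarray, margin: float = 0.3) -> np.ndarray:
    rect = detector(img, 1)[0]                    # first detected face
    pts = predictor(img, rect)                    # 68 facial key points
    xs = [pts.part(i).x for i in range(68)]
    ys = [pts.part(i).y for i in range(68)]
    dx = int((max(xs) - min(xs)) * margin)        # widen the crop slightly
    dy = int((max(ys) - min(ys)) * margin)
    return img[max(min(ys) - dy, 0):max(ys) + dy,
               max(min(xs) - dx, 0):max(xs) + dx]
```

The cropped face then goes to the semantic segmentation network, as in the split_regions sketch above.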
S5, mix the latent code vector of the real face image with the latent code vector of the face region of the image to be restored to obtain the mixed face latent code vector, and feed the mixed latent code vector together with the latent code feature map of the background region of the image to be restored into the StyleGAN generator network to obtain the restored face image.
In a specific implementation, the latent code vector of the image to be restored and the latent code vector of the real face image are mixed in the ratio 8:10 to obtain the mixed latent code vector: the first 8 × 512 dimensions of the latent code vector of the face to be restored are concatenated with the last 10 × 512 dimensions of the latent code vector of the real face to form a new 18 × 512 latent code vector. Because StyleGAN controls image generation through the latent code vector, different dimensions of the vector control the generated effect at different scales. The 8:10 mixing ratio retains information such as the coarse facial-feature style and appearance of the face image to be restored while making full use of the face priors contained in the StyleGAN generator network.
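The splice itself is a one-line concatenation; the sketch below follows the 8:10 split stated above, with shapes as given in the text.

```python
import torch

def mix_latents(w_damaged: torch.Tensor, w_real: torch.Tensor) -> torch.Tensor:
    # w_damaged, w_real: (18, 512) W+ latent code vectors
    assert w_damaged.shape == w_real.shape == (18, 512)
    # first 8 x 512 from the face to be restored, last 10 x 512 from the real face
    return torch.cat([w_damaged[:8], w_real[8:]], dim=0)
```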
The mixed latent code vector and the background latent code feature map of the image to be restored are fed into the StyleGAN generator network, which outputs the reconstructed image. Because the background of every picture is unique, using the latent code vector to store face information and background information simultaneously would overburden it; processing the face image and the background image separately, with the latent code feature map storing the background information alone, facilitates the reconstruction of diverse background information.
In summary, the StyleGAN-based face image restoration method preserves the identity information of the face to be restored while restoring the facial features, skin, texture, and luster of the image. First, a StyleGAN generator is trained to acquire rich face prior knowledge. Second, by imposing pixel-level loss, overall perceptual-similarity loss, and face-attribute-similarity loss on the image, the encoder network learns to express the face and background information accurately through latent code vectors and feature maps. The image is then reconstructed under the dual control of the latent code vector and the latent code feature map: the reconstructed image carries the facial features and appearance of the face image to be restored, with added skin luster and texture.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications to those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiments and all alterations and modifications that fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (6)

1. A StyleGAN-based face image restoration method, characterized by comprising the following steps:
segmenting a real face image into a face region and a background region, which together serve as a training set;
applying horizontal-flip data augmentation to the data set, and setting the original image as a label;
training an encoder with the training set and the label to obtain an encoder network;
using the encoder network to extract a latent code vector of the real face image, a latent code vector of the face region of an image to be restored, and a latent code feature map of the background region of the image to be restored;
and mixing the latent code vector of the real face image with the latent code vector of the face region of the image to be restored to obtain a mixed face latent code vector, and feeding the mixed face latent code vector together with the latent code feature map of the background region of the image to be restored into a StyleGAN generator network to obtain the restored face image.
2. The StyleGAN-based face image restoration method of claim 1, wherein training the encoder with the training set and the label comprises the following steps:
encoding the image, in which the face region and the background region are encoded separately: for the face region, an encoder combining ResNet50 with SE attention modules encodes the input face-region image to obtain a latent code vector of the face part; for the background region, a convolutional neural network extracts background features to obtain a latent code feature map of the background part;
reconstructing the image, in which the latent code vector of the face part and the latent code feature map of the background part are fed into a StyleGAN2 generator to obtain a reconstructed image;
and optimizing the encoder, in which the pixel-wise L2 distance, the perceptual similarity score, and the L2 distance between face identity features are computed from the label image and the reconstructed image, and the encoder network is optimized to obtain the trained encoder network.
3. The StyleGAN-based face image restoration method of claim 2, wherein the encoder structure combining ResNet50 with SE attention modules is used to extract the latent code vector of the face-region image.
4. The StyleGAN-based face image restoration method of claim 2, wherein the face latent code vector has dimension 18 × 512 and the background latent code feature map has dimension 512 × 64.
5. The StyleGAN-based face image restoration method of claim 2, wherein the encoder is optimized with three loss functions: the first computes an L2 distance between the image label and the generated image over pixel values; the second extracts deep feature information from the image label and the generated image with a VGG16 neural network and computes the L2 distance between the two sets of deep features; and the third extracts face feature information from the image label and the generated image with a face recognition neural network and computes the L2 distance between the two sets of face features.
6. The StyleGAN-based face image restoration method of claim 1, wherein the latent code vector of the real face image and the latent code vector of the face region of the image to be restored are mixed in a ratio of 8:10.
CN202210736142.9A 2022-06-27 2022-06-27 StyleGAN-based face image restoration method Pending CN115049556A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210736142.9A CN115049556A (en) 2022-06-27 2022-06-27 StyleGAN-based face image restoration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210736142.9A CN115049556A (en) 2022-06-27 2022-06-27 StyleGAN-based face image restoration method

Publications (1)

Publication Number Publication Date
CN115049556A true CN115049556A (en) 2022-09-13

Family

ID=83164006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210736142.9A Pending CN115049556A (en) 2022-06-27 2022-06-27 StyleGAN-based face image restoration method

Country Status (1)

Country Link
CN (1) CN115049556A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115631527A (en) * 2022-10-31 2023-01-20 福州大学至诚学院 Angle self-adaption based hair style attribute editing method and system
CN115861343A (en) * 2022-12-12 2023-03-28 中山大学·深圳 Method and system for representing arbitrary scale image based on dynamic implicit image function
CN115861343B (en) * 2022-12-12 2024-06-04 中山大学·深圳 Arbitrary scale image representation method and system based on dynamic implicit image function
CN116362972A (en) * 2023-05-22 2023-06-30 飞狐信息技术(天津)有限公司 Image processing method, device, electronic equipment and storage medium
CN116362972B (en) * 2023-05-22 2023-08-08 飞狐信息技术(天津)有限公司 Image processing method, device, electronic equipment and storage medium
CN116884077A (en) * 2023-09-04 2023-10-13 上海任意门科技有限公司 Face image category determining method and device, electronic equipment and storage medium
CN116884077B (en) * 2023-09-04 2023-12-08 上海任意门科技有限公司 Face image category determining method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Golts et al. Unsupervised single image dehazing using dark channel prior loss
CN111340122B (en) Multi-modal feature fusion text-guided image restoration method
CN110443842B (en) Depth map prediction method based on visual angle fusion
CN109255831B (en) Single-view face three-dimensional reconstruction and texture generation method based on multi-task learning
CN115049556A (en) StyleGAN-based face image restoration method
CN111275518A (en) Video virtual fitting method and device based on mixed optical flow
CN112184585B (en) Image completion method and system based on semantic edge fusion
CN111832745A (en) Data augmentation method and device and electronic equipment
CN113808005A (en) Video-driving-based face pose migration method and device
CN111932458B (en) Image information extraction and generation method based on inter-region attention mechanism
CN112686816A (en) Image completion method based on content attention mechanism and mask code prior
CN110766623A (en) Stereo image restoration method based on deep learning
CN116402067B (en) Cross-language self-supervision generation method for multi-language character style retention
CN115272437A (en) Image depth estimation method and device based on global and local features
Wang et al. Unsupervised deep exemplar colorization via pyramid dual non-local attention
CN116863053A (en) Point cloud rendering enhancement method based on knowledge distillation
CN106815879B (en) A kind of quick texture synthesis method based on LBP feature
CN109829857B (en) Method and device for correcting inclined image based on generation countermeasure network
CN111064905B (en) Video scene conversion method for automatic driving
CN118154770A (en) Single tree image three-dimensional reconstruction method and device based on nerve radiation field
Huang et al. Single image super-resolution reconstruction of enhanced loss function with multi-gpu training
Yang et al. R 2 Human: Real-Time 3D Human Appearance Rendering from a Single Image
CN116342385A (en) Training method and device for text image super-resolution network and storage medium
CN116485892A (en) Six-degree-of-freedom pose estimation method for weak texture object
Yao et al. A Generative Image Inpainting Model Based on Edge and Feature Self‐Arrangement Constraints

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination