
CN111860167A - Face fusion model acquisition and face fusion method, device and storage medium - Google Patents


Info

Publication number
CN111860167A
Authority
CN
China
Prior art keywords
face
fusion
attribute information
target
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010560733.6A
Other languages
Chinese (zh)
Other versions
CN111860167B (en)
Inventor
余席宇
洪智滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010560733.6A
Publication of CN111860167A
Application granted
Publication of CN111860167B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method, an apparatus, and a storage medium for obtaining a face fusion model and performing face fusion, relating to the fields of artificial intelligence, deep learning, and computer vision. The method includes: obtaining a training sample comprising a target face image and a template face image; and training with the training sample to obtain a face fusion model for performing face fusion on an input target face image and template face image. The training includes: acquiring attribute information of the target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; constructing an attribute consistency loss function from the attribute information of the target face and that of the fused face; and using the attribute consistency loss function for self-supervised learning of the face fusion model. The scheme can be used in cloud services, improves the face fusion effect, and has wide applicability.

Description

Face fusion model acquisition and face fusion method, device and storage medium
Technical Field
The present application relates to computer application technologies, and in particular to a face fusion model acquisition and face fusion method, apparatus, and storage medium in the fields of artificial intelligence, deep learning, and computer vision.
Background
At present, face fusion technology is widely applied in scenarios such as virtual avatars and entertainment effects in short and long videos and photo albums. Face fusion typically requires preserving the identity information of one face image while fusing in the attribute information of another face image.
Because of its low implementation cost, face fusion is currently mostly performed with blendshape-based methods, but the fusion effect of such methods is poor: the resulting fused images are not realistic enough.
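For context, blendshape-style fusion is essentially a linear mix of aligned face geometry. The following is a minimal sketch of the idea only; the mesh representation and blending weight are illustrative assumptions, not a specific prior-art implementation:

```python
import numpy as np

def blendshape_fuse(target_verts: np.ndarray,
                    template_verts: np.ndarray,
                    alpha: float = 0.5) -> np.ndarray:
    """Linearly blend two aligned face meshes given as (N, 3) vertex arrays.

    The blend is purely geometric; texture, lighting, and skin-detail
    mismatches are not modeled, which is one reason such results often
    look unrealistic.
    """
    return alpha * target_verts + (1.0 - alpha) * template_verts
```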
Disclosure of Invention
The application provides a method and a device for acquiring a face fusion model and fusing a face and a storage medium.
A face fusion model acquisition method comprises the following steps:
obtaining a training sample, wherein the training sample comprises: a target face image and a template face image;
training by using the training sample to obtain a face fusion model for performing face fusion on an input target face image and a template face image, wherein the training comprises the following steps:
acquiring attribute information of a target face in the target face image;
performing face fusion on the target face image and the template face image to obtain a fused face;
acquiring attribute information of the fused face;
constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
A face fusion method, comprising:
acquiring a target face image and a template face image;
and inputting the target face image and the template face image into the face fusion model obtained according to the method to obtain an output face fusion result.
A face fusion model acquisition apparatus, comprising: the system comprises a sample acquisition module and a model training module;
the sample acquisition module is used for acquiring a training sample, wherein the training sample comprises: a target face image and a template face image;
the model training module is used for training by using the training sample to obtain a face fusion model for performing face fusion on an input target face image and a template face image, wherein the training comprises the following steps: acquiring attribute information of a target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face; and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
A face fusion apparatus comprising: the system comprises an image acquisition module and a face fusion module;
the image acquisition module is used for acquiring a target face image and a template face image;
and the face fusion module is used for inputting the target face image and the template face image into the face fusion model obtained according to the method to obtain an output face fusion result.
An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as described above.
A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method as described above.
One embodiment of the above application has the following advantages or benefits: a face fusion model can be trained in which the attribute information of the target face and of the fused face is obtained separately, decoupling identity from attributes. An attribute consistency loss function can be constructed from the obtained attribute information and used to guide model training, improving the training effect and, in turn, the fusion effect when the trained model performs face fusion, so that more realistic fused images are obtained. Moreover, given any target face image and template face image, the trained model produces a corresponding face fusion result, so the scheme has wide applicability and low implementation cost. It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flowchart of an embodiment of a face fusion model acquisition method according to the present application;
FIG. 2 is a schematic diagram of a face fusion model acquisition process according to the present application;
FIG. 3 is a flowchart of an embodiment of a face fusion method according to the present application;
FIG. 4 is a schematic structural diagram of a face fusion model acquisition apparatus 40 according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a face fusion apparatus 50 according to an embodiment of the present application;
FIG. 6 is a block diagram of an electronic device according to the method of an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of those embodiments to aid understanding; these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In addition, it should be understood that the term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
Fig. 1 is a flowchart of an embodiment of a face fusion model acquisition method according to the present application. As shown in fig. 1, the method includes the following steps.
In 101, training samples are obtained, the training samples including: a target face image and a template face image.
In 102, a face fusion model for performing face fusion on the input target face image and template face image is obtained by training with the training sample, where the training process includes steps 103 to 106.
In 103, attribute information of the target face in the target face image is acquired.
In 104, face fusion is performed on the target face image and the template face image to obtain a fused face.
At 105, attribute information of the fused face is acquired.
At 106, an attribute consistency loss function is constructed according to the attribute information of the target face and the attribute information of the fused face, and the attribute consistency loss function is used for self-supervised learning of the face fusion model.
It can be seen that in this embodiment a face fusion model can be trained in which the attribute information of the target face and of the fused face is obtained separately, decoupling identity from attributes. An attribute consistency loss function can be constructed from the obtained attribute information and used to guide model training, which improves the training effect and, in turn, the fusion effect when the trained model performs face fusion, yielding more realistic fused images. Moreover, given any target face image and template face image, the trained model produces a corresponding face fusion result, so the scheme has wide applicability and low implementation cost.
During model training, a plurality of training samples can be obtained as needed. In each training sample, the target face image and the template face image correspond to different persons, i.e., each training sample corresponds to a pair of persons, and different training samples may correspond to different persons.
For the target face image in a training sample, the attribute information of the target face is acquired by using a face three-dimensional reconstruction method. Preferably, the face three-dimensional reconstruction method may be used to acquire the parameters of a three-dimensional deformation model (3DMM, 3D Morphable Model) of the target face, and the attribute information of the target face may then be extracted from the 3DMM parameters.
The face three-dimensional reconstruction itself may be realized with existing techniques. Different dimensions of the 3DMM parameters correspond to information such as the identity, expression, pose, and illumination of the person; the attribute information mainly comprises the expression and pose information.
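As an illustration of this attribute extraction, the coefficient vector produced by a face three-dimensional reconstruction network can simply be sliced by semantic group. The layout below is a hypothetical assumption; the exact dimensions depend on the 3DMM basis and reconstruction network actually used:

```python
import torch

# Hypothetical 3DMM coefficient layout; the exact split depends on the
# reconstruction network and 3DMM basis used.
SLICES = {
    "identity":     slice(0, 80),     # shape/identity coefficients
    "expression":   slice(80, 144),   # expression coefficients
    "pose":         slice(144, 150),  # rotation + translation
    "illumination": slice(150, 177),  # lighting coefficients
}

def extract_attributes(coeffs: torch.Tensor) -> torch.Tensor:
    """Keep only the attribute part (expression + pose) of the 3DMM
    coefficients, discarding identity, so attributes are decoupled
    from identity."""
    return torch.cat([coeffs[..., SLICES["expression"]],
                      coeffs[..., SLICES["pose"]]], dim=-1)
```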
In addition, face fusion may be performed on the target face image and the template face image to obtain a fused face. Preferably, a face fusion method based on a generative adversarial network (GAN) may be used to perform the fusion. In practice, any GAN-based face fusion method may be adopted, such as the existing FaceShifter face-swapping method.
Face fusion typically requires preserving the identity information of one face image while fusing in the attribute information of another face image. In the FaceShifter method, an autoencoder-style attribute reconstruction network encodes the target face, and its features at all scales are fused with the identity information of the template face.
In addition, this embodiment combines the GAN-based face fusion method with the face three-dimensional reconstruction method, exploiting the advantages of each and thereby further improving the model training effect.
Then, the attribute information of the fused face is acquired by using the face three-dimensional reconstruction method. Preferably, the 3DMM parameters of the fused face may be obtained with the face three-dimensional reconstruction method, and the attribute information of the fused face may then be extracted from the 3DMM parameters.
With the face three-dimensional reconstruction method, the required attribute information can be acquired conveniently and accurately, decoupling identity from attributes and facilitating subsequent processing.
After the above processing, the attribute information of the target face and that of the fused face have been obtained separately. Since the two are expected to be consistent after fusion, an attribute consistency loss function can be constructed from the attribute information of the target face and that of the fused face and used for self-supervised learning of the face fusion model.
Preferably, the L2 norm of the difference between the attribute information of the target face and that of the fused face may be calculated as the attribute consistency loss function, i.e., ||A - B||_2, where A and B denote the attribute information of the target face and of the fused face, respectively.
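In code, this loss is a plain L2 distance between the two attribute vectors; a minimal sketch under the same assumptions as the slicing example above:

```python
import torch

def attribute_consistency_loss(attr_target: torch.Tensor,
                               attr_fused: torch.Tensor) -> torch.Tensor:
    """||A - B||_2, where A and B are the attribute vectors of the target
    face and the fused face, respectively (averaged over the batch)."""
    return torch.norm(attr_target - attr_fused, p=2, dim=-1).mean()
```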
In addition, the attribute consistency loss function may be combined with the identity consistency loss function of the GAN-based face fusion method for the self-supervised learning of the face fusion model; for example, it may be combined with the identity consistency loss function of the FaceShifter method.
In this way, both the consistency of the attributes of the fused face with those of the target face and the consistency of the identity of the fused face with that of the template face are ensured.
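Putting the pieces together, one self-supervised training iteration might look like the following sketch. Here fusion_net (the GAN generator), recon_net (the face three-dimensional reconstruction network), and id_encoder (a face recognition embedding network) are assumed stand-ins, the loss weights are illustrative, and extract_attributes / attribute_consistency_loss are the helpers sketched above; this is not the patent's exact training procedure:

```python
import torch
import torch.nn.functional as F

def train_step(fusion_net, recon_net, id_encoder, optimizer,
               target_img, template_img, w_attr=1.0, w_id=1.0):
    # Attribute info of the target face via 3D reconstruction (frozen net).
    with torch.no_grad():
        attr_target = extract_attributes(recon_net(target_img))

    # Fuse the two faces with the GAN-based generator.
    fused_img = fusion_net(target_img, template_img)

    # Attribute info of the fused face (gradients flow into fusion_net).
    attr_fused = extract_attributes(recon_net(fused_img))

    # Self-supervised attribute consistency loss: no labels are needed,
    # the target face itself supervises the fused face's attributes.
    loss_attr = attribute_consistency_loss(attr_target, attr_fused)

    # Identity consistency: the fused face should keep the template
    # face's identity (cosine distance of identity embeddings).
    loss_id = 1.0 - F.cosine_similarity(
        id_encoder(fused_img), id_encoder(template_img), dim=-1).mean()

    loss = w_attr * loss_attr + w_id * loss_id
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```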
Based on the above introduction, fig. 2 is a schematic diagram of a face fusion model acquisition process according to the present application. As shown in fig. 2, a face three-dimensional reconstruction method may be used to acquire the attribute information of the target face in the target face image, e.g., by obtaining the 3DMM parameters of the target face and extracting the attribute information from them. Face fusion is performed on the target face image and the template face image to obtain a fused face, e.g., with the FaceShifter method among the GAN-based face fusion methods. The attribute information of the fused face is then acquired with the face three-dimensional reconstruction method, e.g., by obtaining the 3DMM parameters of the fused face and extracting the attribute information from them. Next, an attribute consistency loss function is constructed according to the attribute information of the target face and that of the fused face, e.g., as the L2 norm of the difference between the two attribute vectors. Finally, the attribute consistency loss function is combined with the identity consistency loss function of the FaceShifter method for the self-supervised learning of the face fusion model.
Fig. 3 is a flowchart of an embodiment of a face fusion method according to the present application. As shown in fig. 3, the method includes the following steps.
In 301, a target face image and a template face image are acquired.
In 302, the target face image and the template face image are input into a face fusion model obtained by pre-training, and an output face fusion result is obtained.
The face fusion model may be a face fusion model obtained according to the method shown in fig. 1.
After the face fusion model is obtained, actual face fusion can be performed with it: any target face image and template face image to be fused can be input into the model to obtain the output face fusion result.
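For illustration, inference then reduces to a single forward pass through the trained model; a hypothetical sketch, in which the image size and preprocessing are assumptions:

```python
import torch
from PIL import Image
import torchvision.transforms as T

preprocess = T.Compose([T.Resize((256, 256)), T.ToTensor()])

def fuse_faces(fusion_net, target_path: str, template_path: str) -> torch.Tensor:
    """Run the trained face fusion model on one target/template image pair."""
    target = preprocess(Image.open(target_path).convert("RGB")).unsqueeze(0)
    template = preprocess(Image.open(template_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        # Fused result keeps the template's identity and the target's
        # attributes (expression, pose), per the model described above.
        fused = fusion_net(target, template)
    return fused
```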
It is noted that while for simplicity of explanation, the foregoing method embodiments are described as a series of acts or combination of acts, those skilled in the art will appreciate that the present application is not limited by the order of acts, as some steps may, in accordance with the present application, occur in other orders and concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application. In addition, for parts which are not described in detail in a certain embodiment, reference may be made to relevant descriptions in other embodiments.
The scheme can be used for cloud service. The above is a description of method embodiments, and the embodiments of the present application are further described below by way of apparatus embodiments.
Fig. 4 is a schematic structural diagram of a face fusion model acquisition apparatus 40 according to an embodiment of the present application. As shown in fig. 4, the apparatus includes: a sample acquisition module 401 and a model training module 402.
A sample obtaining module 401, configured to obtain a training sample, where the training sample includes: a target face image and a template face image.
A model training module 402, configured to train, by using the training samples, a face fusion model for performing face fusion on an input target face image and a template face image, the training including: acquiring attribute information of the target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face; and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
For the target face image in a training sample, the model training module 402 may acquire the attribute information of the target face by using a face three-dimensional reconstruction method, e.g., obtain the 3DMM parameters of the target face and extract the attribute information of the target face from the 3DMM parameters.
Different dimensions of the 3DMM parameters correspond to information such as the identity, expression, pose, and illumination of the person; the attribute information mainly comprises the expression and pose information.
The model training module 402 may also perform face fusion on the target face image and the template face image by using a GAN-based face fusion method. In practice, any GAN-based face fusion method may be adopted, such as the existing FaceShifter method.
The model training module 402 may further obtain attribute information of the fused face by using a face three-dimensional reconstruction method, for example, obtain 3DMM parameters of the fused face by using a face three-dimensional reconstruction method, and extract the attribute information of the fused face from the 3DMM parameters.
After the attribute information of the target face and the attribute information of the fused face are respectively obtained, the model training module 402 may calculate the L2 norm of the difference between them as the attribute consistency loss function, i.e., ||A - B||_2, where A and B denote the attribute information of the target face and of the fused face, respectively.
In addition, the model training module 402 may perform the self-supervised learning of the face fusion model by combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method, for example, that of the FaceShifter method.
Fig. 5 is a schematic structural diagram of a face fusion apparatus 50 according to an embodiment of the present application. As shown in fig. 5, the apparatus includes: an image acquisition module 501 and a face fusion module 502.
An image obtaining module 501, configured to obtain a target face image and a template face image.
The face fusion module 502 is configured to input the target face image and the template face image into a face fusion model obtained through pre-training, so as to obtain an output face fusion result.
The face fusion model may be a face fusion model obtained according to the method shown in fig. 1, that is, a face fusion model obtained by the apparatus shown in fig. 4.
After the face fusion model is obtained, actual face fusion can be performed with it: any target face image and template face image to be fused can be input into the model to obtain the output face fusion result.
For the specific work flow of the embodiment of the apparatus shown in fig. 4 and 5, reference is made to the related description in the foregoing method embodiment, and details are not repeated.
In summary, with the solution of the apparatus embodiments of the present application, a face fusion model can be trained in which the face three-dimensional reconstruction method decouples the identity and attributes of the target face and of the fused face. An attribute consistency loss function can be constructed from the obtained attribute information and used to guide model training, improving the training effect and, in turn, the fusion effect when the trained model performs face fusion, so that more realistic fused images are obtained. Given any target face image and template face image, the trained model produces a corresponding face fusion result, so the scheme has wide applicability and low implementation cost. In addition, the GAN-based face fusion method yields a better fusion effect and facilitates subsequent processing, and combining it with the face three-dimensional reconstruction method exploits the advantages of each, further improving the model training effect. The face three-dimensional reconstruction method also allows the required attribute information to be acquired conveniently and accurately. Furthermore, combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method for self-supervised learning ensures both that the attributes of the fused face are consistent with those of the target face and that the identity of the fused face is consistent with that of the template face.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device according to the method of the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 6, the electronic apparatus includes: one or more processors Y01, a memory Y02, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information for a graphical user interface on an external input/output device (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 6, one processor Y01 is taken as an example.
Memory Y02 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the methods provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
Memory Y02 is provided as a non-transitory computer readable storage medium that can be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods of the embodiments of the present application. The processor Y01 executes various functional applications of the server and data processing, i.e., implements the method in the above-described method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory Y02.
The memory Y02 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Additionally, the memory Y02 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory Y02 may optionally include memory located remotely from processor Y01, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, blockchain networks, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device Y03 and an output device Y04. The processor Y01, the memory Y02, the input device Y03 and the output device Y04 may be connected by a bus or in another manner, and the connection by the bus is exemplified in fig. 6.
The input device Y03 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as a touch screen, keypad, mouse, track pad, touch pad, pointer, one or more mouse buttons, track ball, joystick, or other input device. The output device Y04 may include a display device, an auxiliary lighting device, a tactile feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display, a light emitting diode display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific integrated circuits, computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a cathode ray tube or a liquid crystal display monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks, wide area networks, blockchain networks, and the internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (16)

1. A face fusion model acquisition method is characterized by comprising the following steps:
obtaining a training sample, wherein the training sample comprises: a target face image and a template face image;
training by using the training sample to obtain a face fusion model for performing face fusion on an input target face image and a template face image, wherein the training comprises the following steps:
acquiring attribute information of a target face in the target face image;
performing face fusion on the target face image and the template face image to obtain a fused face;
acquiring attribute information of the fused face;
and constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervision learning of the face fusion model by using the attribute consistency loss function.
2. The face fusion model acquisition method according to claim 1,
the acquiring of the attribute information of the target face in the target face image includes: acquiring attribute information of the target face by using a face three-dimensional reconstruction method;
the acquiring of the attribute information of the fused face comprises: acquiring the attribute information of the fused face by using a face three-dimensional reconstruction method.
3. The face fusion model acquisition method according to claim 2,
the acquiring of the attribute information of the target face by using the face three-dimensional reconstruction method comprises the following steps: acquiring three-dimensional deformation model parameters of the target face by using a face three-dimensional reconstruction method, and extracting attribute information of the target face from the three-dimensional deformation model parameters;
the acquiring of the attribute information of the fused face by using the face three-dimensional reconstruction method comprises: acquiring three-dimensional deformation model parameters of the fused face by using the face three-dimensional reconstruction method, and extracting attribute information of the fused face from the three-dimensional deformation model parameters.
4. The face fusion model acquisition method according to claim 1,
the face fusion of the target face image and the template face image comprises the following steps:
performing face fusion on the target face image and the template face image by using a face fusion method based on a generative adversarial network.
5. The face fusion model acquisition method according to claim 1,
the constructing of the attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face comprises:
and calculating the L2 norm of the attribute information of the target face and the attribute information of the fused face as the attribute consistency loss function.
6. The face fusion model acquisition method according to claim 4,
the performing of the self-supervised learning of the face fusion model by using the attribute consistency loss function includes:
and performing the self-supervision learning of the face fusion model by combining the attribute consistency loss function and the identity consistency loss function in the face fusion method based on the generation countermeasure network.
7. A face fusion method is characterized by comprising the following steps:
acquiring a target face image and a template face image;
inputting the target face image and the template face image into a face fusion model obtained according to the method of any one of claims 1 to 6, and obtaining an output face fusion result.
8. A face fusion model acquisition apparatus, comprising: the system comprises a sample acquisition module and a model training module;
the sample acquisition module is used for acquiring a training sample, wherein the training sample comprises: a target face image and a template face image;
the model training module is used for training by using the training sample to obtain a face fusion model for performing face fusion on an input target face image and a template face image, wherein the training comprises: acquiring attribute information of a target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face; and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
9. The face fusion model acquisition apparatus according to claim 8,
the model training module acquires the attribute information of the target face by using a face three-dimensional reconstruction method, and acquires the attribute information of the fused face by using a face three-dimensional reconstruction method.
10. The face fusion model acquisition apparatus according to claim 9,
the model training module acquires three-dimensional deformation model parameters of the target face by using a face three-dimensional reconstruction method, extracts attribute information of the target face from the three-dimensional deformation model parameters, acquires three-dimensional deformation model parameters of the fused face by using the face three-dimensional reconstruction method, and extracts the attribute information of the fused face from the three-dimensional deformation model parameters.
11. The face fusion model acquisition apparatus according to claim 8,
and the model training module performs face fusion on the target face image and the template face image by using a face fusion method based on a generative adversarial network.
12. The face fusion model acquisition apparatus according to claim 8,
and the model training module calculates the L2 norm of the difference between the attribute information of the target face and the attribute information of the fused face as the attribute consistency loss function.
13. The face fusion model acquisition apparatus according to claim 11,
the model training module combines the attribute consistency loss function and the identity consistency loss function in the face fusion method based on the generative adversarial network to perform the self-supervised learning of the face fusion model.
14. A face fusion device, comprising: the system comprises an image acquisition module and a face fusion module;
the image acquisition module is used for acquiring a target face image and a template face image;
the face fusion module is configured to input the target face image and the template face image into a face fusion model obtained according to the method of any one of claims 1 to 6, and obtain an output face fusion result.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202010560733.6A 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium Active CN111860167B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010560733.6A CN111860167B (en) 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010560733.6A CN111860167B (en) 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium

Publications (2)

Publication Number Publication Date
CN111860167A true CN111860167A (en) 2020-10-30
CN111860167B CN111860167B (en) 2024-01-26

Family

ID=72986226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010560733.6A Active CN111860167B (en) 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium

Country Status (1)

Country Link
CN (1) CN111860167B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381927A (en) * 2020-11-19 2021-02-19 北京百度网讯科技有限公司 Image generation method, device, equipment and storage medium
CN112950775A (en) * 2021-04-27 2021-06-11 南京大学 Three-dimensional face model reconstruction method and system based on self-supervision learning
CN113052025A (en) * 2021-03-12 2021-06-29 咪咕文化科技有限公司 Training method of image fusion model, image fusion method and electronic equipment
CN113362263A (en) * 2021-05-27 2021-09-07 百度在线网络技术(北京)有限公司 Method, apparatus, medium, and program product for changing the image of a virtual idol
CN113642491A (en) * 2021-08-20 2021-11-12 北京百度网讯科技有限公司 Face fusion method, and training method and device of face fusion model
CN113902956A (en) * 2021-09-30 2022-01-07 北京百度网讯科技有限公司 Training method of fusion model, image fusion method, device, equipment and medium
WO2022218085A1 (en) * 2021-04-13 2022-10-20 腾讯科技(深圳)有限公司 Method and apparatus for obtaining virtual image, computer device, computer-readable storage medium, and computer program product
WO2023020358A1 (en) * 2021-08-20 2023-02-23 腾讯科技(深圳)有限公司 Facial image processing method and apparatus, method and apparatus for training facial image processing model, and device, storage medium and program product
WO2023040679A1 (en) * 2021-09-16 2023-03-23 百果园技术(新加坡)有限公司 Fusion method and apparatus for facial images, and device and storage medium
CN118364340A (en) * 2024-06-19 2024-07-19 深圳市启程教育科技有限公司 Student course short board positioning method, system and storage medium based on deep learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107578017A (en) * 2017-09-08 2018-01-12 百度在线网络技术(北京)有限公司 Method and apparatus for generating image
CN110503703A (en) * 2019-08-27 2019-11-26 北京百度网讯科技有限公司 Method and apparatus for generating image

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107578017A (en) * 2017-09-08 2018-01-12 百度在线网络技术(北京)有限公司 Method and apparatus for generating image
CN110503703A (en) * 2019-08-27 2019-11-26 北京百度网讯科技有限公司 Method and apparatus for generating image

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381927A (en) * 2020-11-19 2021-02-19 北京百度网讯科技有限公司 Image generation method, device, equipment and storage medium
CN113052025A (en) * 2021-03-12 2021-06-29 咪咕文化科技有限公司 Training method of image fusion model, image fusion method and electronic equipment
WO2022218085A1 (en) * 2021-04-13 2022-10-20 腾讯科技(深圳)有限公司 Method and apparatus for obtaining virtual image, computer device, computer-readable storage medium, and computer program product
CN112950775A (en) * 2021-04-27 2021-06-11 南京大学 Three-dimensional face model reconstruction method and system based on self-supervision learning
CN113362263A (en) * 2021-05-27 2021-09-07 百度在线网络技术(北京)有限公司 Method, apparatus, medium, and program product for changing the image of a virtual idol
CN113362263B (en) * 2021-05-27 2023-09-15 百度在线网络技术(北京)有限公司 Method, apparatus, medium and program product for transforming an image of a virtual idol
CN113642491A (en) * 2021-08-20 2021-11-12 北京百度网讯科技有限公司 Face fusion method, and training method and device of face fusion model
WO2023020358A1 (en) * 2021-08-20 2023-02-23 腾讯科技(深圳)有限公司 Facial image processing method and apparatus, method and apparatus for training facial image processing model, and device, storage medium and program product
WO2023040679A1 (en) * 2021-09-16 2023-03-23 百果园技术(新加坡)有限公司 Fusion method and apparatus for facial images, and device and storage medium
CN113902956A (en) * 2021-09-30 2022-01-07 北京百度网讯科技有限公司 Training method of fusion model, image fusion method, device, equipment and medium
WO2023050868A1 (en) * 2021-09-30 2023-04-06 北京百度网讯科技有限公司 Method and apparatus for training fusion model, image fusion method and apparatus, and device and medium
CN113902956B (en) * 2021-09-30 2023-04-07 北京百度网讯科技有限公司 Training method of fusion model, image fusion method, device, equipment and medium
CN118364340A (en) * 2024-06-19 2024-07-19 深圳市启程教育科技有限公司 Student course short board positioning method, system and storage medium based on deep learning
CN118364340B (en) * 2024-06-19 2024-08-16 深圳市启程教育科技有限公司 Student course short board positioning method, system and storage medium based on deep learning

Also Published As

Publication number Publication date
CN111860167B (en) 2024-01-26

Similar Documents

Publication Publication Date Title
CN111860167A (en) Face fusion model acquisition and face fusion method, device and storage medium
CN112001180A (en) Multi-mode pre-training model acquisition method and device, electronic equipment and storage medium
CN110806865B (en) Animation generation method, device, equipment and computer readable storage medium
CN111462029B (en) Visual point cloud and high-precision map fusion method and device and electronic equipment
CN112270711B (en) Model training and posture prediction method, device, equipment and storage medium
CN112529073A (en) Model training method, attitude estimation method and apparatus, and electronic device
CN111582375A (en) Data enhancement strategy searching method, device, equipment and storage medium
JP2021192294A (en) Human 3d key point detection method, model training method and related device
CN111722245A (en) Positioning method, positioning device and electronic equipment
CN111859997A (en) Model training method and device in machine translation, electronic equipment and storage medium
CN111695519B (en) Method, device, equipment and storage medium for positioning key point
CN111695516B (en) Thermodynamic diagram generation method, device and equipment
CN111968203A (en) Animation driving method, animation driving device, electronic device, and storage medium
CN112149741A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN113870399B (en) Expression driving method and device, electronic equipment and storage medium
CN112529180A (en) Method and apparatus for model distillation
CN111524123A (en) Method and apparatus for processing image
CN111539897A (en) Method and apparatus for generating image conversion model
CN111275827A (en) Edge-based augmented reality three-dimensional tracking registration method and device and electronic equipment
CN112241716B (en) Training sample generation method and device
CN112529018A (en) Training method and device for local features of image and storage medium
CN112561059A (en) Method and apparatus for model distillation
CN112529181A (en) Method and apparatus for model distillation
CN111967481A (en) Visual positioning method and device, electronic equipment and storage medium
CN111767990A (en) Neural network processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant