CN111368763A - Image processing method and device based on head portrait and computer readable storage medium - Google Patents

Image processing method and device based on head portrait and computer readable storage medium

Info

Publication number: CN111368763A
Authority: CN (China)
Prior art keywords: head portrait, avatar, gender, initial, training
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202010158270.0A
Other languages: Chinese (zh)
Inventors: 刘洁, 郑雷
Current Assignee: Beijing QIYI Century Science and Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2020-03-09 (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Filing date: 2020-03-09
Application filed by Beijing QIYI Century Science and Technology Co Ltd

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N3/00: Computing arrangements based on biological models
                    • G06N3/02: Neural networks
                        • G06N3/08: Learning methods
            • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
                • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
                    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
                        • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
                            • G06V40/161: Detection; Localisation; Normalisation
                            • G06V40/168: Feature extraction; Face representation
                            • G06V40/172: Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention provides an image processing method and device based on head portraits and a computer readable storage medium, comprising the following steps: constructing a first processing model through a first training set; constructing a second processing model through a second training set; acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait; under the condition that the gender is the first gender, inputting the initial head portrait into the first processing model to obtain a first special-effect head portrait; under the condition that the gender is the second gender, inputting the initial head portrait into the second processing model to obtain a second special-effect head portrait; wherein the first training set comprises a first initial training avatar of the first gender and a first special effect training avatar, and the second training set comprises a second initial training avatar of the second gender and a second special effect training avatar. By training a separate processing model per gender, each model can process head portrait pictures of its corresponding gender in a more targeted manner and generate a high-quality special effect according to the gender of the user's initial head portrait.

Description

Image processing method and device based on head portrait and computer readable storage medium
Technical Field
The invention belongs to the technical field of computers, and particularly relates to an image processing method and device based on a head portrait and a computer readable storage medium.
Background
Image stylization renders an image as a painting with an artistic style, keeping the original content of the image while adding an artistic effect through rendering techniques.
In the prior art, when a user creates an account in an application, the user may take a photo or select an image from an album as the account head portrait. At present, personalized modification of the head portrait consists of adding a fixed, universal set of artistic effects to it, so as to obtain a personalized head portrait that satisfies the user, such as adding filters or adjusting the head portrait to a watercolor style.
However, head portraits of different genders differ greatly in image characteristics; for example, female head portraits mostly show long hair, while male head portraits mostly show short hair. Personalizing the account head portrait with a single fixed, universal set of artistic effects therefore produces a rigid artistic effect that carries no gender attribute, so the personalized modification functions poorly.
Disclosure of Invention
In view of the above, the invention provides an image processing method and device based on a head portrait and a computer-readable storage medium, which solve the problem that, in the current scheme, personalized modification of the account head portrait yields a rigid artistic effect and poor functionality.
According to a first aspect of the present invention, there is provided an avatar-based image processing method, which may include:
constructing a first processing model through a first training set;
constructing a second processing model through a second training set;
acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait;
under the condition that the gender is the first gender, inputting the initial head portrait into the first processing model to obtain a first special-effect head portrait;
under the condition that the gender is the second gender, inputting the initial head portrait into the second processing model to obtain a second special-effect head portrait;
wherein the first training set comprises a first initial training avatar of the first gender and a first special-effect training avatar corresponding to the first initial training avatar; the second training set comprises a second initial training avatar of the second gender and a second special effect training avatar corresponding to the second initial training avatar.
According to a second aspect of the present invention, there is provided an avatar-based image processing apparatus, which may include:
the first establishing module is used for establishing a first processing model through a first training set;
the second establishing module is used for establishing a second processing model through a second training set;
the first acquisition module is used for acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait;
the first processing module is used for inputting the initial head portrait into the first processing model under the condition that the gender is the first gender to obtain a first special-effect head portrait;
the second processing module is used for inputting the initial head portrait into the second processing model under the condition that the gender is the second gender to obtain a second special-effect head portrait;
wherein the first training set comprises a first initial training avatar of the first gender and a first special-effect training avatar corresponding to the first initial training avatar; the second training set comprises a second initial training avatar of the second gender and a second special effect training avatar corresponding to the second initial training avatar.
In a third aspect, the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the avatar-based image processing method according to the first aspect.
Compared with the prior art, the invention has the following advantages:
The invention provides an image processing method based on head portraits, which comprises the following steps: constructing a first processing model through a first training set; constructing a second processing model through a second training set; acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait; under the condition that the gender is the first gender, inputting the initial head portrait into a first processing model to obtain a first special-effect head portrait; under the condition that the gender is the second gender, inputting the initial head portrait into a second processing model to obtain a second special-effect head portrait; the first training set comprises a first initial training head portrait of a first gender and a first special-effect training head portrait corresponding to the first initial training head portrait; the second training set comprises a second initial training avatar of a second gender and a second special effect training avatar corresponding to the second initial training avatar. Because processing models corresponding to different genders are obtained from gender-specific training sets, each processing model can process head portrait pictures of its corresponding gender in a more targeted manner and generate a high-quality special effect according to the gender of the user's initial head portrait, which improves the quality of special-effect processing of user head portraits.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart illustrating steps of a method for processing an image based on an avatar according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps of another avatar-based image processing method according to an embodiment of the present invention;
fig. 3 is a block diagram of an image processing apparatus based on an avatar according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 is a flowchart of steps of an image processing method based on an avatar according to an embodiment of the present invention, applied to a terminal, and as shown in fig. 1, the method may include:
step 101, a first processing model is constructed through a first training set.
The first training set comprises a first initial training head portrait of the first gender and a first special-effect training head portrait corresponding to the first initial training head portrait.
In the embodiment of the present invention, the first initial training avatar and the first special effect training avatar may be obtained by crawling from a preset avatar database or a material database, and in addition, the first initial training avatar and the first special effect training avatar may also be obtained by downloading from the internet.
The gender of the avatar in the first initial training avatar and the first special-effect training avatar can be defined, for example, the first gender of the first initial training avatar and the first special-effect training avatar is male, and the second gender of the second initial training avatar and the second special-effect training avatar is female; or the first sex of the obtained first initial training head portrait and the first special effect training head portrait is female, and the second sex of the obtained second initial training head portrait and the second special effect training head portrait is male.
In addition, the special effect attribute of the first special effect training avatar may be a specific special effect, such as a filter special effect, a watercolor special effect, a mosaic special effect, and the like.
For example, a certain number of real head portraits of male users, i.e., head portrait pictures without any special effect added, may be extracted from a public material database or the user avatar database of the service server as the first initial training head portraits. In addition, the same number of male watercolor special-effect head portraits can be selected from a public material database or the user avatar database of the service server as the first special-effect training head portraits, and the first initial training head portraits and the first special-effect training head portraits are then paired one by one to form the first training set.
It should be noted that, after the first initial training head portraits are obtained, several parallel special-effect processing algorithms pre-established for a given special effect may be applied to each first initial training head portrait to obtain a plurality of candidate special-effect training head portraits; the one with the best effect is then selected as the first special-effect training head portrait and paired with the first initial training head portrait to obtain the first training set.
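As a minimal sketch of how such a paired training set might be assembled in practice (the patent does not prescribe any storage layout; the directory names, the one-file-per-pair convention, and the PyTorch dependency below are assumptions for illustration):

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class PairedAvatarDataset(Dataset):
    """One sample = (initial head portrait, special-effect head portrait) of one gender.

    Hypothetical layout: initial_dir/0001.jpg pairs with effect_dir/0001.jpg.
    """

    def __init__(self, initial_dir, effect_dir, transform=None):
        self.initial_dir = initial_dir
        self.effect_dir = effect_dir
        self.names = sorted(os.listdir(initial_dir))  # pairing is by file name
        self.transform = transform  # e.g. torchvision ToTensor, for training

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        initial = Image.open(os.path.join(self.initial_dir, name)).convert("RGB")
        effect = Image.open(os.path.join(self.effect_dir, name)).convert("RGB")
        if self.transform is not None:
            initial, effect = self.transform(initial), self.transform(effect)
        return initial, effect

# e.g. first_training_set = PairedAvatarDataset("male/initial", "male/effect")
```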
After the first training set is constructed, a deep learning model can be trained on it; once training is complete, a first processing model that converts an initial head portrait of the first gender into a special-effect head portrait is obtained. The deep learning model may be a convolutional neural network (CNN) model.
Specifically, the training process may include: inputting a first initial training head portrait of the first training set into the deep learning model; computing a difference value measuring the similarity between the output of the deep learning model and the first special-effect training head portrait corresponding to that first initial training head portrait; and solving for better model parameters from this difference value and a preset loss function. After the model parameters are adjusted, another first initial training head portrait is input and its corresponding difference value is computed, optimizing the parameters further. After many iterations the computed difference value becomes small enough, at which point the deep learning model is trained.
The loss function is a function that maps the value of a random event or its related random variable to a non-negative real number to represent the "risk" or "loss" of the random event.
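A minimal training-loop sketch of the process just described (the patent leaves the loss function and optimizer open, so the L1 loss, the Adam optimizer, and all hyperparameters below are assumptions; `model` stands for any image-to-image network such as a CNN, and the dataset must already yield tensor pairs):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_processing_model(model, paired_dataset, epochs=50, lr=1e-4, device="cuda"):
    """Supervised training on (initial, special-effect) head portrait pairs."""
    loader = DataLoader(paired_dataset, batch_size=16, shuffle=True)
    criterion = nn.L1Loss()  # stands in for the patent's "preset loss function"
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for _ in range(epochs):
        for initial, effect in loader:
            initial, effect = initial.to(device), effect.to(device)
            output = model(initial)
            # the "difference value" between model output and target avatar
            loss = criterion(output, effect)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```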
In the embodiment of the invention, head portraits of different genders differ greatly in image characteristics; for example, female head portraits mostly show long hair, and male head portraits mostly show short hair. Therefore, processing models corresponding to different genders are trained independently on gender-specific training sets, so that each processing model processes head portrait pictures of its corresponding gender in a more targeted manner.
And 102, constructing a second processing model through a second training set.
And the second training set comprises a second initial training head portrait of the second gender and a second special-effect training head portrait corresponding to the second initial training head portrait.
The step may specifically refer to the related description in step 101, and is not described herein again.
And 103, acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait.
In the embodiment of the invention, the initial head portrait may be a head portrait photo shot by the user through a camera of the terminal, and in addition, the initial head portrait may also be a head portrait photo selected by the user from an album or a network of the terminal.
Specifically, after the initial head portrait is obtained, the gender corresponding to the initial head portrait can be obtained through a preset classifier.
The classifier can extract image features of a face region in the initial head portrait, similarity calculation is carried out on the image features and a preset first gender feature template and a preset second gender feature template, and if the similarity between the image features and the first gender feature template is larger than the similarity between the image features and the second gender feature template, the gender corresponding to the initial head portrait is considered as the first gender; and if the similarity between the image features and the second gender feature template is greater than the similarity between the image features and the first gender feature template, the gender corresponding to the initial head portrait is considered as the second gender.
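A sketch of the comparison step just described, assuming the face features and the two gender feature templates are already available as vectors (cosine similarity is an assumption; the patent does not fix the similarity measure):

```python
import numpy as np

def classify_gender(face_features, first_template, second_template):
    """Return which preset gender feature template the face matches better."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    # the gender whose template is more similar to the face features wins
    if cosine(face_features, first_template) > cosine(face_features, second_template):
        return "first_gender"
    return "second_gender"
```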
And 104, inputting the initial head portrait into the first processing model to obtain a first special-effect head portrait under the condition that the gender is the first gender.
In this step, according to the first processing model obtained in step 101, the initial avatar of the first gender may be used as a model input, the first special-effect avatar corresponding to the initial avatar of the first gender may be output, and the first special-effect avatar may be returned to the user client.
It should be noted that a plurality of different sub-processing models may be built in the first processing model, and all of the sub-processing models take the initial avatar as input and output the special effect avatar according to respective corresponding algorithms. Therefore, the first processing model can output a plurality of first special effect head portraits corresponding to the initial head portraits in a parallel processing mode, and return the plurality of first special effect head portraits to the user client side so that the user can select a satisfactory first special effect head portrait for use.
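The parallel fan-out over sub-processing models might look like the following sketch (thread-based parallelism and the callable sub-model interface are assumptions; GPU-bound models would more likely be batched or dispatched to separate worker processes):

```python
from concurrent.futures import ThreadPoolExecutor

def run_sub_models(sub_models, initial_avatar):
    """Apply every sub-processing model to the same initial head portrait in
    parallel and collect all candidate special-effect head portraits, which
    can then be returned to the client for the user to choose from."""
    with ThreadPoolExecutor(max_workers=len(sub_models)) as pool:
        futures = [pool.submit(m, initial_avatar) for m in sub_models]
        return [f.result() for f in futures]
```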
Therefore, in practical application, a processing model corresponding to the male gender is used for a male initial head portrait, so that the resulting special-effect head portrait better matches male gender characteristics; a processing model corresponding to the female gender is used for a female initial head portrait, so that the resulting special-effect head portrait better matches female gender characteristics. For example, when applying a watercolor effect to an initial head portrait, the model corresponding to the male gender renders short hair better in the resulting special-effect head portrait, while the model corresponding to the female gender renders long hair better.
And 105, inputting the initial head portrait into the second processing model to obtain a second special-effect head portrait under the condition that the gender is a second gender.
The step may specifically refer to the related description in step 104, and is not described herein again.
To sum up, the image processing method based on the avatar provided by the embodiment of the present invention includes: constructing a first processing model through a first training set; constructing a second processing model through a second training set; acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait; under the condition that the gender is the first gender, inputting the initial head portrait into a first processing model to obtain a first special-effect head portrait; under the condition that the gender is the second gender, inputting the initial head portrait into a second processing model to obtain a second special-effect head portrait; the first training set comprises a first initial training head portrait of a first gender and a first special-effect training head portrait corresponding to the first initial training head portrait; the second training set comprises a second initial training avatar of a second gender and a second special effect training avatar corresponding to the second initial training avatar. Because processing models corresponding to different genders are obtained from gender-specific training sets, each processing model can process head portrait pictures of its corresponding gender in a more targeted manner and generate a high-quality special effect according to the gender of the user's initial head portrait, which improves the quality of special-effect processing of user head portraits.
Fig. 2 is a flowchart illustrating steps of another avatar-based image processing method according to an embodiment of the present invention, where as shown in fig. 2, the method may include:
step 201, obtaining the first initial training avatar, the first special effect training avatar, the second initial training avatar, and the second special effect training avatar.
In the embodiment of the present invention, the first initial training avatar and the first special effect training avatar may be obtained by crawling from a preset avatar database or a material database, and in addition, the first initial training avatar and the first special effect training avatar may also be obtained by downloading from the internet.
Optionally, in an implementation manner of the embodiment of the present invention, step 201 may specifically include:
sub-step 2011, generating a random noise signal.
In the embodiment of the present invention, a noise signal is data containing errors or anomalies (deviations from expected values). Here the random noise signal may be Gaussian noise, i.e., noise whose probability density function follows a Gaussian (normal) distribution; Gaussian noise is also a common kind of noise in digital images. Common sources of Gaussian noise include fluctuation noise, cosmic noise, thermal noise, shot noise, and so on. In the embodiment of the present application, the Gaussian noise may exist in the form of a noise image and be processed by the adversarial generation network model into a corresponding first initial training avatar or second initial training avatar. Of course, in the embodiment of the present application, other types of noise, such as Rayleigh noise or gamma noise, may also be used.
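Sampling such a Gaussian noise image is straightforward; a sketch (the shape, mean, and standard deviation are illustrative values, not taken from the patent):

```python
import numpy as np

def gaussian_noise_image(height=256, width=256, mean=0.0, std=1.0, seed=None):
    """Sample a single-channel noise image whose pixels follow N(mean, std^2)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=mean, scale=std, size=(height, width)).astype(np.float32)
```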
Substep 2012, inputting the random noise signal, a first real avatar of a preset first gender and a second real avatar of a preset second gender into an adversarial generation network model, and outputting the first initial training avatar and the second initial training avatar.
Specifically, the adversarial generation network model, i.e., a generative adversarial network (GAN), is a deep learning model that includes a generator and a discriminator; its framework contains (at least) these two modules, and the game-like mutual learning of the generator and the discriminator produces the desired output.
In the process of training the adversarial generation network model, two sample image sets corresponding to the two genders, each comprising a plurality of real head portraits, can first be determined, so that the discriminator can consult the real head portraits in the sample image sets. Then the acquired random noise signal is input to the generator in image form, and the generator produces a candidate image from this noise data. Owing to the limitations of the generator's parameters, the candidate image can be either "true" (highly similar to a real head portrait) or "false" (highly dissimilar to a real head portrait). To eventually generate initial training head portraits of good quality, a good generator should continuously optimize its parameters so that the probability of generating a "true" candidate image exceeds a set threshold.
Optionally, sub-step 2012 may specifically include:
substep 20121, inputting the random noise signal, the first real avatar, and the second real avatar into the impedance-type generation network model.
Sub-step 20122, in the adversarial generation network model, generating a candidate image based on the random noise signal.
Sub-step 20123, in case the similarity value between the image feature of the candidate image and the image feature of the first real avatar is greater than or equal to a first preset threshold, determining the candidate image as the first initial training avatar.
Sub-step 20124, in case the similarity value between the image feature of the candidate image and the image feature of the second real avatar is greater than or equal to a second preset threshold, determining the candidate image as the second initial training avatar.
In this embodiment of the application, the discriminator may be used to judge whether the candidate image output by the generator is "true" or "false": specifically, the candidate image output by the generator is matched against the real avatars, and if the candidate image is highly similar to a real avatar it is judged "true", i.e., a real avatar; if it is highly dissimilar, it is judged "false", i.e., not a real avatar. In practical application, the real avatar and the candidate image can each be sampled and then input into the discriminator, which judges authenticity from the two sampling results. After multiple rounds of iterative training, the generator becomes able to deceive the discriminator, which improves the quality of the generator's output; meanwhile, over those rounds the discriminator gradually improves its own performance, so that it accurately judges "true" and "false" candidate images and the probability that the discriminator correctly judges a candidate image as a true or false characteristic image exceeds a set threshold.
Further, the adversarial generation network model may designate the first real avatars of the first gender as a first verification set and the second real avatars of the second gender as a second verification set. The discriminator then matches the candidate image output by the generator against the first verification set and the second verification set respectively, to judge whether the candidate image is highly similar to either set: if the candidate image is highly similar to the first real avatars, it is determined to be a first initial training avatar; if it is highly similar to the second real avatars, it is determined to be a second initial training avatar. Specifically, whether the candidate image is highly similar to the first or second verification set can be judged by computing a similarity value between the image features of the candidate image and the image features of that verification set.
After training is completed, the adversarial generation network model can generate first initial training avatars and second initial training avatars from random noise signals.
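A sketch of sub-steps 20122 through 20124, assuming a trained `generator`, a feature extractor `extract_features`, precomputed verification-set features, and cosine similarity with placeholder thresholds (none of these specifics are fixed by the patent):

```python
import numpy as np

def route_candidates(generator, noise_batch, extract_features,
                     first_set_feats, second_set_feats,
                     first_threshold=0.8, second_threshold=0.8):
    """Generate candidate images from noise and sort each one into the
    first or second initial-training set by feature similarity."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    first_initial, second_initial = [], []
    for noise in noise_batch:
        candidate = generator(noise)
        feats = extract_features(candidate)
        # sub-step 20123: similar enough to the first verification set
        if max(cosine(feats, f) for f in first_set_feats) >= first_threshold:
            first_initial.append(candidate)
        # sub-step 20124: similar enough to the second verification set
        elif max(cosine(feats, f) for f in second_set_feats) >= second_threshold:
            second_initial.append(candidate)
    return first_initial, second_initial
```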
Optionally, in an implementation manner of the embodiment of the present invention, step 201 may specifically include:
and a substep 2013 of acquiring an initial special effect training head portrait from a preset material database.
In embodiments of the present invention, the initial special effect training avatars may be obtained from a local material database or from an open-source material database on the network.
And a substep 2014 of deleting the initial special effect training avatar with the resolution less than or equal to the preset threshold.
In this step, in order to obtain a standard initial special effect training avatar and avoid the situation that the initial special effect training avatar is not clear due to too small resolution, a preset threshold may be set, and the initial special effect training avatar with the resolution less than or equal to the preset threshold in the obtained initial special effect training avatar is deleted.
Sub-step 2015, cutting the size of the remaining initial special effect training avatar to a preset size.
In this step, the remaining initial special effect training avatars may also be cut to a preset size, improving their uniformity so that the subsequent training operations run on a normalized size format and the precision of training improves (a sketch of sub-steps 2014 and 2015 follows sub-step 2018 below).
Substep 2016, determining a gender corresponding to the initial special effect training avatar.
Specifically, after the initial special-effect training head portrait is obtained, the gender corresponding to the initial special-effect training head portrait can be obtained through a preset classifier.
The classifier can extract image features of a face region in the initial special-effect training head portrait, similarity calculation is carried out on the image features and a preset first gender feature template and a preset second gender feature template, and if the similarity between the image features and the first gender feature template is larger than the similarity between the image features and the second gender feature template, the gender corresponding to the initial special-effect training head portrait is considered as the first gender; and if the similarity between the image features and the second gender feature template is greater than the similarity between the image features and the first gender feature template, the gender corresponding to the initial special effect training avatar is considered as the second gender.
Substep 2017, determining the initial special-effect training avatar belonging to the first gender as the first special-effect training avatar.
And a substep 2018 of determining the initial special-effect training avatar belonging to the second gender as the second special-effect training avatar.
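As referenced above, a minimal sketch of sub-steps 2014 and 2015 (the resolution threshold, the target size, and the use of a center crop are assumptions; the patent fixes none of them):

```python
from PIL import Image

def filter_and_crop(paths, min_resolution=256, target_size=(256, 256)):
    """Drop avatars whose shorter side is at or below the threshold
    (sub-step 2014), then center-crop the rest to the preset size
    (sub-step 2015)."""
    kept = []
    for path in paths:
        img = Image.open(path).convert("RGB")
        if min(img.size) <= min_resolution:
            continue  # resolution too low: delete from the training material
        w, h = img.size
        tw, th = target_size
        left, top = (w - tw) // 2, (h - th) // 2
        kept.append(img.crop((left, top, left + tw, top + th)))
    return kept
```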
Step 202, constructing the first processing model according to the first training set and an unsupervised generative attention network framework.
In the embodiment of the present invention, the process of constructing the first processing model through the first training set may be based on the unsupervised generative attention network framework U-GAT-IT (Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation), which implements an unsupervised image-to-image translation method.
Specifically, the unsupervised generative attention network framework combines a new attention mechanism and a new learnable normalization function in an end-to-end manner. Guided by an attention map obtained from an auxiliary classifier, the attention mechanism focuses on the regions that matter most for distinguishing the source domain from the target domain, so the constructed model can translate both images requiring holistic change and images requiring large shape deformation. The new adaptive layer-instance normalization function flexibly controls the amount of change in shape and texture through parameters learned from the data set.
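For illustration, a simplified sketch of the adaptive layer-instance normalization (AdaLIN) at the heart of U-GAT-IT: a learnable ratio rho blends instance-normalized and layer-normalized activations, and gamma/beta (predicted elsewhere in the full network from the attention features) rescale the result. This condenses the published formulation and omits the surrounding generator:

```python
import torch
import torch.nn as nn

class AdaLIN(nn.Module):
    """Adaptive layer-instance normalization, simplified from U-GAT-IT."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        # learnable blend ratio between instance norm and layer norm
        self.rho = nn.Parameter(torch.full((1, num_features, 1, 1), 0.9))

    def forward(self, x, gamma, beta):
        # instance-norm statistics: per sample, per channel
        in_mean = x.mean(dim=[2, 3], keepdim=True)
        in_var = x.var(dim=[2, 3], keepdim=True)
        # layer-norm statistics: per sample, across all channels
        ln_mean = x.mean(dim=[1, 2, 3], keepdim=True)
        ln_var = x.var(dim=[1, 2, 3], keepdim=True)
        x_in = (x - in_mean) / torch.sqrt(in_var + self.eps)
        x_ln = (x - ln_mean) / torch.sqrt(ln_var + self.eps)
        rho = self.rho.clamp(0.0, 1.0)
        x_hat = rho * x_in + (1.0 - rho) * x_ln
        n, c = x.size(0), x.size(1)
        # gamma and beta are assumed to arrive as (batch, channels) tensors
        return x_hat * gamma.view(n, c, 1, 1) + beta.view(n, c, 1, 1)
```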
And 203, constructing the second processing model according to the second training set and the unsupervised generation attention network framework.
The step may specifically refer to the related description in step 202, and is not described herein again.
And 204, acquiring an initial head portrait, and identifying a face area of the initial head portrait.
In this step, the face region of the initial avatar may be identified by face recognition technology. Face recognition is a biometric technology that identifies a person based on facial feature information, and comprises four steps: face image acquisition and detection, face image preprocessing, face image feature extraction, and matching and identification.
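As an illustration of the detection step only (OpenCV's Haar cascade detector is one concrete choice, assumed here; the patent does not name a detector):

```python
import cv2

def detect_face_regions(image_path):
    """Return face bounding boxes (x, y, w, h) found in the initial head portrait."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    return detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```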
And step 205, inputting the face region into a preset classifier to obtain the gender corresponding to the initial head portrait.
In the embodiment of the invention, the classifier can extract the image characteristics of the face region in the initial head portrait, and carry out similarity calculation on the image characteristics and a preset first gender characteristic template and a preset second gender characteristic template, if the similarity between the image characteristics and the first gender characteristic template is greater than the similarity between the image characteristics and the second gender characteristic template, the gender corresponding to the initial head portrait is considered as the first gender; and if the similarity between the image features and the second gender feature template is greater than the similarity between the image features and the first gender feature template, the gender corresponding to the initial head portrait is considered as the second gender.
Optionally, step 205 may specifically include:
sub-step 2051, identifying the face region of the initial avatar.
Sub-step 2052, determining the size of each face area if the number of face areas is multiple.
In the embodiment of the present invention, a plurality of face regions often exist in an initial head portrait acquired by a user for various reasons, for example, when the user shoots the initial head portrait, the face of another person exists in a shooting range.
When face recognition determines that the initial head portrait contains multiple face regions, the region size of each face region can be calculated in turn.
And a substep 2053 of inputting the face region with the largest region size into the classifier to obtain the gender corresponding to the initial head portrait.
In the embodiment of the present invention, in the scenario where a user sets an avatar, the user usually intends one face region in the initial head portrait to be the focus, so that this face region conveys the user's attributes; and the focused face region of a picture is usually the one with the largest region size. Therefore, the face region with the largest region size can be input into the classifier to obtain the gender corresponding to the initial head portrait, and subsequent model processing is performed according to the gender of that largest face region.
For example, a user may want to use a group selfie taken with several friends as a head portrait; during shooting, the user is closer to the camera than the others, so the user's face region has the largest size in the resulting picture.
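Picking that region is a one-liner once a face detector has produced bounding boxes; a sketch (the (x, y, w, h) box format is an assumption, matching the detection sketch above):

```python
def largest_face_region(face_boxes):
    """Sub-steps 2052 and 2053: from detected face bounding boxes given as
    (x, y, w, h) tuples, return the one with the largest area, which is
    then fed to the gender classifier."""
    if not face_boxes:
        raise ValueError("no face region detected in the initial head portrait")
    return max(face_boxes, key=lambda box: box[2] * box[3])
```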
And step 206, inputting the initial head portrait into the first processing model under the condition that the gender is the first gender to obtain a first special-effect head portrait.
The step may specifically refer to the related description in step 104, and is not described herein again.
And step 207, inputting the initial head portrait into the second processing model to obtain a second special-effect head portrait under the condition that the gender is the second gender.
The step may specifically refer to the related description in step 105, and is not described herein again.
In summary, the image processing method based on the avatar provided in the embodiment of the present invention includes: constructing a first processing model through a first training set; constructing a second processing model through a second training set; acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait; under the condition that the gender is the first gender, inputting the initial head portrait into a first processing model to obtain a first special-effect head portrait; under the condition that the gender is the second gender, inputting the initial head portrait into a second processing model to obtain a second special-effect head portrait; the first training set comprises a first initial training head portrait of a first gender and a first special-effect training head portrait corresponding to the first initial training head portrait; the second training set comprises a second initial training avatar of a second gender and a second special effect training avatar corresponding to the second initial training avatar. Because processing models corresponding to different genders are obtained from gender-specific training sets, each processing model can process head portrait pictures of its corresponding gender in a more targeted manner and generate a high-quality special effect according to the gender of the user's initial head portrait, which improves the quality of special-effect processing of user head portraits.
Fig. 3 is a block diagram of an image processing apparatus based on avatar according to an embodiment of the present invention, and as shown in fig. 3, the apparatus 30 may include:
a first establishing module 301, configured to establish a first processing model through a first training set;
optionally, the first establishing module 301 includes:
the first construction sub-module is used for constructing the first processing model according to the first training set and an unsupervised generative attention network framework;
a second building module 302, configured to build a second processing model through a second training set;
optionally, the second establishing module 302 includes:
and the second construction sub-module is used for constructing the second processing model according to the second training set and the unsupervised generative attention network framework.
A first obtaining module 303, configured to obtain an initial head portrait and identify a gender corresponding to the initial head portrait;
optionally, the first obtaining module 303 includes:
the recognition submodule is used for recognizing the face area of the initial head portrait;
the size determination submodule determines the area size of each face area under the condition that the number of the face areas is multiple;
and the classification submodule inputs the face region with the largest region size into a classifier to obtain the gender corresponding to the initial head portrait.
A first processing module 304, configured to input the initial avatar into the first processing model to obtain a first special-effect avatar when the gender is a first gender;
a second processing module 305, configured to, when the gender is a second gender, input the initial avatar into the second processing model to obtain a second special-effect avatar;
wherein the first training set comprises a first initial training avatar of the first gender and a first special-effect training avatar corresponding to the first initial training avatar; the second training set comprises a second initial training avatar of the second gender and a second special effect training avatar corresponding to the second initial training avatar.
Optionally, the apparatus further comprises:
a generating module for generating a random noise signal;
and the third processing module is used for inputting the random noise signal, a first real head portrait of a preset first gender and a second real head portrait of a preset second gender into an adversarial generation network model and outputting the first initial training head portrait and the second initial training head portrait.
Optionally, the third processing module includes:
the input submodule is used for inputting the random noise signal, the first real head portrait and the second real head portrait into the adversarial generation network model;
the generation submodule is used for generating a candidate image according to the random noise signal in the adversarial generation network model;
a first matching sub-module, configured to determine the candidate image as the first initial training avatar if a similarity value between an image feature of the candidate image and an image feature of the first real avatar is greater than or equal to a first preset threshold;
a second matching sub-module, configured to determine the candidate image as the second initial training avatar if a similarity value between the image feature of the candidate image and the image feature of the second real avatar is greater than or equal to a second preset threshold.
In summary, the avatar-based image processing apparatus provided in the embodiment of the present invention performs the following: constructing a first processing model through a first training set; constructing a second processing model through a second training set; acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait; under the condition that the gender is the first gender, inputting the initial head portrait into a first processing model to obtain a first special-effect head portrait; under the condition that the gender is the second gender, inputting the initial head portrait into a second processing model to obtain a second special-effect head portrait; the first training set comprises a first initial training head portrait of a first gender and a first special-effect training head portrait corresponding to the first initial training head portrait; the second training set comprises a second initial training avatar of a second gender and a second special effect training avatar corresponding to the second initial training avatar. Because processing models corresponding to different genders are obtained from gender-specific training sets, each processing model can process head portrait pictures of its corresponding gender in a more targeted manner and generate a high-quality special effect according to the gender of the user's initial head portrait, which improves the quality of special-effect processing of user head portraits.
For the above device embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for the relevant points, refer to the partial description of the method embodiment.
Preferably, an embodiment of the present invention further provides a terminal, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the image processing method embodiment based on the avatar, and can achieve the same technical effect, and details are not repeated here to avoid repetition.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the image processing method embodiment based on the avatar, and can achieve the same technical effect, and in order to avoid repetition, the computer program is not described herein again. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As is readily imaginable to the person skilled in the art: any combination of the above embodiments is possible, and thus any combination between the above embodiments is an embodiment of the present invention, but the present disclosure is not necessarily detailed herein for reasons of space.
The avatar-based image processing methods provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The structure required to construct a system incorporating aspects of the present invention will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the avatar-based image processing method according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.

Claims (10)

1. An avatar-based image processing method, the method comprising:
constructing a first processing model through a first training set;
constructing a second processing model through a second training set;
acquiring an initial head portrait and identifying the gender corresponding to the initial head portrait;
under the condition that the gender is the first gender, inputting the initial head portrait into the first processing model to obtain a first special-effect head portrait;
under the condition that the gender is the second gender, inputting the initial head portrait into the second processing model to obtain a second special-effect head portrait;
wherein the first training set comprises a first initial training avatar of the first gender and a first special-effect training avatar corresponding to the first initial training avatar; the second training set comprises a second initial training avatar of the second gender and a second special effect training avatar corresponding to the second initial training avatar.
2. The method of claim 1, wherein prior to said constructing a first process model from a first training set, the method further comprises:
generating a random noise signal;
inputting the random noise signal, a first real head portrait of a preset first gender and a second real head portrait of a preset second gender into an adversarial generation network model, and outputting the first initial training head portrait and the second initial training head portrait.
3. The method of claim 2, wherein inputting the random noise signal, a first real avatar of a preset first gender, and a second real avatar of a preset second gender into an adversarial generation network model, and outputting the first initial training avatar and the second initial training avatar comprises:
inputting the random noise signal, the first real head portrait and the second real head portrait into the adversarial generation network model;
in the adversarial generation network model, generating a candidate image according to the random noise signal;
determining the candidate image as the first initial training avatar if a similarity value between the image features of the candidate image and the image features of the first real avatar is greater than or equal to a first preset threshold;
determining the candidate image as the second initial training avatar if the similarity value between the image features of the candidate image and the image features of the second real avatar is greater than or equal to a second preset threshold.
4. The method of claim 1, wherein constructing the first process model from the first training set comprises:
constructing the first processing model according to the first training set and an unsupervised generative attention network framework;
the constructing of the second processing model through the second training set comprises:
and constructing the second processing model according to the second training set and the unsupervised generative attention network framework.
5. The method of claim 1, wherein the identifying the gender corresponding to the initial avatar comprises:
identifying a face region of the initial head portrait;
determining the area size of each face area under the condition that the number of the face areas is multiple;
and inputting the face region with the largest region size into a classifier to obtain the gender corresponding to the initial head portrait.
6. An avatar-based image processing apparatus, the apparatus comprising:
a first construction module, configured to construct a first processing model through a first training set;
a second construction module, configured to construct a second processing model through a second training set;
a first acquisition module, configured to acquire an initial avatar and identify a gender corresponding to the initial avatar;
a first processing module, configured to input the initial avatar into the first processing model when the gender is the first gender, to obtain a first special-effect avatar;
a second processing module, configured to input the initial avatar into the second processing model when the gender is the second gender, to obtain a second special-effect avatar;
wherein the first training set comprises a first initial training avatar of the first gender and a first special-effect training avatar corresponding to the first initial training avatar, and the second training set comprises a second initial training avatar of the second gender and a second special-effect training avatar corresponding to the second initial training avatar.
7. The apparatus of claim 6, further comprising:
a generation module, configured to generate a random noise signal;
a third processing module, configured to input the random noise signal, a preset first real avatar of the first gender, and a preset second real avatar of the second gender into a generative adversarial network model, and to output the first initial training avatar and the second initial training avatar.
8. The apparatus of claim 7, wherein the third processing module comprises:
an input sub-module, configured to input the random noise signal, the first real avatar, and the second real avatar into the generative adversarial network model;
a generation sub-module, configured to generate, in the generative adversarial network model, a candidate image from the random noise signal;
a first matching sub-module, configured to determine the candidate image as the first initial training avatar if a similarity value between an image feature of the candidate image and an image feature of the first real avatar is greater than or equal to a first preset threshold;
a second matching sub-module, configured to determine the candidate image as the second initial training avatar if a similarity value between the image feature of the candidate image and the image feature of the second real avatar is greater than or equal to a second preset threshold.
9. The apparatus of claim 6, wherein the first construction module comprises:
a first construction sub-module, configured to construct the first processing model according to the first training set and an unsupervised generative attentional network framework;
and the second construction module comprises:
a second construction sub-module, configured to construct the second processing model according to the second training set and the unsupervised generative attentional network framework.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the avatar-based image processing method as claimed in any one of claims 1 to 5.
CN202010158270.0A 2020-03-09 2020-03-09 Image processing method and device based on head portrait and computer readable storage medium Pending CN111368763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010158270.0A CN111368763A (en) 2020-03-09 2020-03-09 Image processing method and device based on head portrait and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN111368763A true CN111368763A (en) 2020-07-03

Family

ID=71208603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010158270.0A Pending CN111368763A (en) 2020-03-09 2020-03-09 Image processing method and device based on head portrait and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111368763A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070104362A1 (en) * 2005-11-08 2007-05-10 Samsung Electronics Co., Ltd. Face recognition method, and system using gender information
CN107025629A (en) * 2017-04-27 2017-08-08 Vivo Mobile Communication Co., Ltd. Image processing method and mobile terminal
CN109284694A (en) * 2018-08-31 2019-01-29 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Image processing method and device, electronic equipment, computer readable storage medium
CN109711254A (en) * 2018-11-23 2019-05-03 Beijing Jiaotong University Image processing method and device based on generative adversarial network
CN110310222A (en) * 2019-06-20 2019-10-08 Beijing QIYI Century Science and Technology Co Ltd Image style transfer method and apparatus, electronic device, and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536870A (en) * 2020-07-09 2021-10-22 Tencent Technology (Shenzhen) Co., Ltd. Abnormal avatar identification method and device
CN112464009A (en) * 2020-11-17 2021-03-09 Baidu (China) Co., Ltd. Method and device for generating paired images, electronic device, and storage medium
CN112991150A (en) * 2021-02-08 2021-06-18 Beijing Zitiao Network Technology Co., Ltd. Style image generation method, model training method, device and equipment
WO2022166908A1 (en) * 2021-02-08 2022-08-11 北京字跳网络技术有限公司 Styled image generation method, model training method, apparatus, and device

Similar Documents

Publication Publication Date Title
Ferrara et al. Face morphing detection in the presence of printing/scanning and heterogeneous image sources
Do et al. Forensics face detection from GANs using convolutional neural network
WO2022161286A1 (en) Image detection method, model training method, device, medium, and program product
CN111476709B (en) Face image processing method and device and electronic equipment
Schmid et al. Performance analysis of iris-based identification system at the matching score level
JP2021089730A (en) Method of detecting at least one element of interest visible in input image by means of convolutional neural network
CN111008935B (en) Face image enhancement method, device, system and storage medium
CN110728188B (en) Image processing method, device, system and storage medium
Seidlitz et al. Generation of Privacy-friendly Datasets of Latent Fingerprint Images using Generative Adversarial Networks.
CN116229528A (en) Living body palm vein detection method, device, equipment and storage medium
CN115240280A (en) Construction method of human face living body detection classification model, detection classification method and device
CN111368763A (en) Image processing method and device based on head portrait and computer readable storage medium
Qiao et al. Csc-net: Cross-color spatial co-occurrence matrix network for detecting synthesized fake images
Manh et al. Small object segmentation based on visual saliency in natural images
CN113807237B (en) Training of in vivo detection model, in vivo detection method, computer device, and medium
CN114387656B (en) Face changing method, device, equipment and storage medium based on artificial intelligence
CN113613070B (en) Face video processing method and device, electronic equipment and storage medium
CN113780424A (en) Real-time online photo clustering method and system based on background similarity
Xiong et al. Deep representation calibrated bayesian neural network for semantically explainable face inpainting and editing
CN111191549A (en) Two-stage face anti-counterfeiting detection method
US20230196073A1 (en) Method for secure use of a first neural network on an input datum and method for learning parameters of a second neural network
WO2021189981A1 (en) Voice noise processing method and apparatus, and computer device and storage medium
KR102488858B1 (en) Method, apparatus and program for digital restoration of damaged object
CN106778811B (en) Image dictionary generation method, image processing method and device
Srinivas et al. E-CNN-FFE: An Enhanced Convolutional Neural Network for Facial Feature Extraction and Its Comparative Analysis with FaceNet, DeepID, and LBPH Methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200703