CN113313635A - Image processing method, model training method, device and equipment
- Publication number: CN113313635A
- Application number: CN202010120806.XA
- Authority: CN (China)
- Prior art keywords: image, processed, processing, face, information
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/73—Deblurring; Sharpening (under G06T5/00—Image enhancement or restoration; G06T—Image data processing or generation, in general; G06—Computing; G—Physics)
- G06T2207/20081—Training; Learning (under G06T2207/20—Special algorithmic details; G06T2207/00—Indexing scheme for image analysis or image enhancement)
- G06T2207/20084—Artificial neural networks [ANN] (under G06T2207/20—Special algorithmic details)
- G06T2207/30201—Face (under G06T2207/30196—Human being; Person; G06T2207/30—Subject of image; Context of image processing)
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
Embodiments of the invention provide an image processing method, a model training method, an apparatus and a device. The image processing method includes: acquiring an image to be processed; determining image coding information corresponding to the image to be processed, the image coding information being used to identify information included in the image to be processed; determining, based on the image coding information, at least one reference image for processing the image to be processed, the at least one reference image satisfying a preset condition; and performing image processing on the image to be processed by using the at least one reference image to obtain a target image corresponding to the image to be processed. In the technical solution provided by this embodiment, the image coding information corresponding to the image to be processed is determined, at least one reference image is determined based on that coding information, and the image to be processed is then processed using the at least one reference image, which both ensures the quality and effect of the processing and reduces its difficulty.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to an image processing method, a model training method, an apparatus and a device.
Background
In the field of image processing technology, sharpness enhancement of blurred face images in images or videos has wide application scenarios. For example, in security surveillance, enhancing a low-definition face image can assist in identifying the people being monitored; and restoring the face images in old photographs, films and television dramas not only improves media quality but also improves the viewing experience of audiences. In many current face restoration schemes, compensating a low-definition face image with details from a high-definition face reference image is a feasible technical solution, but it is subject to the following constraint: finding a high-definition face reference image whose structure and appearance are similar to those of the low-definition face image is very difficult, and that structural and appearance similarity significantly affects the enhancement effect on the face image.
Disclosure of Invention
Embodiments of the invention provide an image processing method, a model training method, an apparatus and a device, which can ensure the quality and effect of processing a face image while reducing the difficulty of doing so, enabling the image processing method to be widely applied in various application scenarios.
In a first aspect, an embodiment of the present invention provides an image processing method, including:
acquiring an image to be processed;
determining image coding information corresponding to the image to be processed, wherein the image coding information is used for identifying information included in the image to be processed;
determining at least one reference image for processing the image to be processed based on the image coding information, wherein the at least one reference image meets a preset condition;
and performing image processing on the image to be processed by using the at least one reference image to obtain a target image corresponding to the image to be processed.
In a second aspect, an embodiment of the present invention provides an image processing apparatus, including:
a first acquisition module, configured to acquire an image to be processed;
a first determining module, configured to determine image coding information corresponding to the image to be processed, where the image coding information is used to identify information included in the image to be processed;
the first determining module being further configured to determine, based on the image coding information, at least one reference image for processing the image to be processed, where the at least one reference image satisfies a preset condition;
and the first processing module is used for carrying out image processing on the image to be processed by utilizing the at least one reference image to obtain a target image corresponding to the image to be processed.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, implement the image processing method of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer storage medium for storing a computer program, where the computer program is used to make a computer implement the image processing method in the first aspect when executed.
In a fifth aspect, an embodiment of the present invention provides a model training method, including:
acquiring a plurality of first images and a plurality of image codes corresponding to the plurality of first images, wherein the image codes are used for identifying information included in the first images;
processing the plurality of image codes by using a convolutional neural network to obtain a plurality of second images;
determining the definition of the first image, the definition of the second image and the similarity of the first image and the second image;
and when the definition of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold value, generating a machine learning model.
In a sixth aspect, an embodiment of the present invention provides a model training apparatus, including:
a second obtaining module, configured to obtain a plurality of first images and a plurality of image codes corresponding to the plurality of first images, where the image codes are used to identify information included in the first images;
the second processing module is used for processing the image codes by utilizing a convolutional neural network to obtain a plurality of second images;
a second determining module, configured to determine a sharpness of the first image, a sharpness of the second image, and a similarity between the first image and the second image;
and the second generation module is used for generating a machine learning model when the definition of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold value.
In a seventh aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, implement the model training method of the fifth aspect.
In an eighth aspect, an embodiment of the present invention provides a computer storage medium for storing a computer program, where the computer program is used to enable a computer to implement the model training method in the fifth aspect when executed.
According to the image processing method, the model training method, the apparatus and the device provided by embodiments of the invention, an image to be processed is acquired, image coding information corresponding to it is determined, at least one reference image for processing it is determined based on the image coding information, and the image to be processed is then processed using the at least one reference image to obtain a corresponding target image. This ensures the quality and effect of processing the image while reducing the difficulty of doing so.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show some embodiments of the present invention, and that those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of inputting the at least one reference image and the image to be processed into a third machine learning model according to an embodiment of the present invention;
fig. 3 is a schematic flowchart of a process of acquiring a fusion feature image corresponding to the at least one reference image and the image to be processed according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating another image processing method according to an embodiment of the present invention;
fig. 5 is a first schematic diagram illustrating an image processing method according to an embodiment of the present invention;
FIG. 6 is a second schematic diagram of an image processing method according to an embodiment of the present invention;
fig. 7 is a third schematic diagram of an image processing method according to an embodiment of the present invention;
fig. 8 is a fourth schematic diagram of an image processing method according to an embodiment of the present invention;
FIG. 9 is a schematic flow chart of a model training method according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of an electronic device corresponding to the image processing apparatus provided in the embodiment shown in fig. 10;
FIG. 12 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of an electronic device corresponding to the model training apparatus provided in the embodiment shown in fig. 12.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the embodiments of the present invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; "a plurality of" generally means at least two, but does not exclude the case of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects, and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrases "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined" or "in response to determining" or "when (a stated condition or event) is detected" or "in response to detecting (a stated condition or event)", depending on the context.
It should also be noted that the terms "comprises", "comprising" and any variations thereof are intended to cover a non-exclusive inclusion, so that a commodity or system comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such commodity or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the commodity or system that includes the element.
In addition, the sequence of steps in each method embodiment described below is only an example and is not strictly limited.
To facilitate understanding of the technical solutions of the present application, the prior art is briefly described as follows: in the field of image processing technology, details from a high-definition face reference image are generally used to compensate a low-definition face image. However, when image enhancement is performed, the reference image is usually a high-definition face image of the same person as in the low-definition face image, which greatly limits the application range of the image enhancement method. For example, in a surveillance scenario, a high-definition face image of the target person cannot be obtained in advance, so image enhancement of the low-definition face image cannot be realized; likewise, when restoring old photographs, it is difficult to obtain a high-definition face image of the person in the image, so image enhancement of the low-definition face image cannot be realized either.
Some embodiments of the invention are described in detail below with reference to the accompanying drawings. The features of the embodiments and examples described below may be combined with each other without conflict between the embodiments.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention. Referring to Fig. 1, this embodiment provides an image processing method whose execution subject may be an image processing apparatus; it is understood that the apparatus may be implemented as software, or as a combination of software and hardware. Specifically, the method may include:
step S101: and acquiring an image to be processed.
Step S102: determining image coding information corresponding to an image to be processed, wherein the image coding information is used for identifying information included in the image to be processed.
Step S103: determining at least one reference image for processing the image to be processed based on the image coding information, wherein the at least one reference image satisfies a preset condition.
Step S104: and performing image processing on the image to be processed by utilizing at least one reference image to obtain a target image corresponding to the image to be processed.
The following is a detailed description of the above steps:
step S101: and acquiring an image to be processed.
The image to be processed is a biological face image on which image processing needs to be performed. It is understood that the image processing may include at least one of image enhancement processing, image blurring processing, image rendering processing, image editing processing and the like. Specifically, image enhancement processing may increase the definition of the image to be processed; image blurring processing may reduce it; image rendering processing may apply rendering such as skin whitening and beautification to an object in the image to be processed; and image editing processing may apply various editing operations to the image to be processed, for example filtering, texture processing, cropping and the like.
In addition, the biological face image may be a human face image, a cat face image, a dog face image, or the face image of another living being. The image to be processed may include at least one of: image information captured by a photographing device, image information in video information, a composite image, and the like. This embodiment does not limit the specific manner in which the image processing apparatus acquires the image to be processed, and those skilled in the art may set it according to specific application requirements and design requirements. For example, the photographing device may be communicatively connected to the image processing apparatus; after the photographing device captures the image to be processed, the image processing apparatus may obtain it from the photographing device, either by actively fetching the captured image or by the photographing device actively sending it. Alternatively, the image to be processed may be stored in a preset area, and the image processing apparatus may obtain it by accessing that area.
Step S102: determining image coding information corresponding to an image to be processed, wherein the image coding information is used for identifying information included in the image to be processed.
After the image to be processed is obtained, it may be analyzed to determine the corresponding image coding information. The image coding information is used to identify information included in the image to be processed, which may include at least one of: color information, pixel information, texture information, shape information, and the like. Specifically, determining the image coding information corresponding to the image to be processed may include:
step S1021: and extracting image characteristic information of the image to be processed.
After the image to be processed is acquired, analyzing the image to be processed to extract image feature information of the image to be processed, where the image feature information may include at least one of: shallow feature information, deep feature information. It is understood that the image feature information is not limited to the shallow feature information and the deep feature information defined above, and those skilled in the art can set the image feature information according to specific application requirements and design requirements, for example: the image feature information may also include fusion feature information, etc., which will not be described in detail herein. Specifically, extracting image feature information of the image to be processed may include:
step S10211: and processing the image to be processed by utilizing a first machine learning model to obtain image characteristic information corresponding to the image to be processed, wherein the first machine learning model is trained to be used for extracting the image characteristic information of the image.
The first machine learning model may be previously trained to extract the image feature information of an image, where the image feature information may include at least one of shallow feature information and deep feature information. In addition, the first machine learning model may be established based on a convolutional neural network during its learning training. After the model is established, the image to be processed can be analyzed and processed by it to obtain the corresponding image feature information.
In this embodiment, using the trained first machine learning model to analyze the image to be processed effectively ensures the accuracy and reliability of obtaining the image feature information, as well as the quality and efficiency of obtaining it, further improving the stability and reliability of the method.
Step S1022: the image feature information is determined as image encoding information corresponding to the image to be processed.
After the image characteristic information is acquired, the image characteristic information can be determined as the image coding information corresponding to the image to be processed, so that the accuracy and reliability of acquiring the image coding information are effectively ensured.
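By way of a non-limiting illustration, steps S1021-S1022 can be sketched as a small convolutional encoder that maps the image to be processed to a coding vector. This is only a hypothetical Python/PyTorch sketch; the layer sizes, names and the 512-dimensional code are assumptions, not the patented implementation:

```python
import torch
import torch.nn as nn

class CodingNetwork(nn.Module):
    """Hypothetical first machine learning model: extracts shallow and
    deep feature information from an image and uses the result as the
    image coding information z (all sizes are illustrative)."""
    def __init__(self, code_dim=512):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),    # shallow features
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),  # deeper features
            nn.AdaptiveAvgPool2d(1),
        )
        self.to_code = nn.Linear(128, code_dim)

    def forward(self, image):                # image: (N, 3, H, W)
        f = self.features(image).flatten(1)  # (N, 128)
        return self.to_code(f)               # coding information z: (N, code_dim)
```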
Step S103: determining at least one reference image for processing the image to be processed based on the image coding information, wherein the at least one reference image satisfies a preset condition.
Wherein the preset condition may include at least one of: the definition of the at least one reference image is different from that of the image to be processed, and the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold value. It is to be understood that the relationship between the image sharpness of the at least one reference image and the sharpness of the image to be processed may include: the definition of the at least one reference image is higher than that of the image to be processed, or the definition of the at least one reference image is lower than that of the image to be processed.
And, in different application scenarios, the preset conditions may include different limiting conditions. For example: in an application scene where the definition of the at least one reference image to be selected is high, the preset condition may include that the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold; or, when the at least one reference image and the image to be processed are directed to the same class of object or similar class of object, the preset condition may include that the definition of the at least one reference image is higher than that of the image to be processed; or, in an application scene that the image needs to be enhanced, the definition of at least one reference image may be higher than that of the image to be processed; in an application scenario where the image needs to be blurred, the sharpness of the at least one reference image may be lower than the sharpness of the image to be processed.
Specifically, after obtaining the image coding information, the image coding information may be analyzed to determine at least one reference image for processing the image to be processed, where determining the at least one reference image for processing the image to be processed based on the image coding information in this embodiment may include:
step S1031: and processing the image coding information by utilizing a second machine learning model to obtain at least one reference image for processing the image to be processed, wherein the second machine learning model is trained to be used for determining at least one reference image for processing the image to be processed based on the image coding information, the definition of the at least one reference image is different from that of the image to be processed, and the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold value.
The second machine learning model may be pre-trained to determine at least one reference image for processing the image to be processed based on the image coding information, where a definition of the at least one reference image is different from a definition of the image to be processed, and a similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold. It is to be understood that the second machine learning model is trained for determining the at least one reference image for enhancement processing of the image to be processed based on the image coding information when the sharpness of the at least one reference image is higher than the sharpness of the image to be processed. When the definition of the at least one reference image is lower than that of the image to be processed, the second machine learning model is trained to determine the at least one reference image used for blurring the image to be processed based on the image coding information.
In addition, when the image to be processed is a face image to be processed, the similarity between the at least one reference image and the image to be processed may include: and the similarity between the structure and the appearance of the face in the at least one reference image and the structure and the appearance of the face in the image to be processed. The structure of the human face comprises at least one of the following components: face orientation (forward, left, right, etc.), pose (head up, head down, etc.), position information of the face relative to the image (center position, left position, right position, etc.); the appearance of the human face includes at least one of: hair features, skin tone features, brightness features, color features.
In addition, when the second machine learning model is subjected to learning training, the learning training can be performed based on the convolutional neural network, so that the second machine learning model is established. Therefore, after the second machine learning model is established, the image coding information can be analyzed and processed by using the second machine learning model, so that at least one reference image for processing the image to be processed can be obtained, and it should be noted that the target included in the at least one reference image can be the same as or different from the target included in the image to be processed.
In some examples, the number of reference images may be one or more; that is, when the second machine learning model is used to analyze the image coding information, at least one reference image for processing the image to be processed may be obtained, and the similarities between different reference images and the image to be processed may be the same or different. After these similarities are obtained, the reference images can be sorted by similarity to obtain a ranking queue, from which the reference image with the highest similarity can be taken as the final target reference image; processing the image to be processed with this target reference image effectively guarantees the quality and effect of the image processing.
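The ranking described above admits a very simple sketch (a hypothetical helper; how the similarity values themselves are produced is left to the second machine learning model or a separate metric):

```python
def select_target_reference(reference_images, similarities):
    """Sort the candidate reference images by their similarity to the
    image to be processed and return the most similar one as the final
    target reference image (illustrative sketch)."""
    ranking_queue = sorted(zip(similarities, range(len(reference_images))),
                           reverse=True)
    best_similarity, best_index = ranking_queue[0]
    return reference_images[best_index], best_similarity

# Hypothetical usage: img_b would be returned, since 0.93 is highest.
# refs, sims = [img_a, img_b, img_c], [0.71, 0.93, 0.65]
# target_reference, sim = select_target_reference(refs, sims)
```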
In this embodiment, the trained second machine learning model is used to analyze and process the image coding information to obtain at least one reference image for processing the image to be processed, so that the quality and efficiency of obtaining the at least one reference image are effectively ensured, and the quality and efficiency of applying the image processing method are further improved.
Step S104: and performing image processing on the image to be processed by utilizing at least one reference image to obtain a target image corresponding to the image to be processed.
After the at least one reference image is acquired, image processing may be performed on the image to be processed using the at least one reference image, so that a target image corresponding to the image to be processed may be obtained. Specifically, the image processing of the image to be processed by using at least one reference image, and obtaining the target image corresponding to the image to be processed may include:
step S1041: and inputting the at least one reference image and the image to be processed into a third machine learning model to perform image processing on the image to be processed based on the at least one reference image by using the third machine learning model to obtain a target image corresponding to the image to be processed, wherein the third machine learning model is trained to perform image processing on the image to be processed based on the at least one reference image, and the definition of the at least one reference image is different from that of the image to be processed.
The third machine learning model may be trained in advance to perform image processing on the image to be processed based on at least one reference image, the definition of the at least one reference image is different from that of the image to be processed, and the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold. It is to be understood that the third machine learning model is trained for enhancement processing of the image to be processed when the sharpness of the at least one reference image is higher than the sharpness of the image to be processed. When the definition of the at least one reference image is lower than that of the image to be processed, the third machine learning model is trained to perform fuzzy processing on the image to be processed.
Specifically, the image to be processed may include a face image to be processed, and at this time, the similarity between the at least one reference image and the image to be processed may include: and the similarity between the structure and the appearance of the face in the at least one reference image and the structure and the appearance of the face in the image to be processed. The structure of the human face comprises at least one of the following components: face orientation, pose, position information of the face relative to the image; the appearance of the human face includes at least one of: hair features, skin tone features, brightness features, color features.
In addition, when the third machine learning model is subjected to learning training, the learning training can be performed based on the convolutional neural network, so that the third machine learning model can be generated and established. Therefore, after obtaining the at least one reference image and the image to be processed, the at least one reference image and the image to be processed may be input to the third machine learning model to perform image processing on the image to be processed based on the at least one reference image by using the third machine learning model, so that a target image corresponding to the image to be processed, that is, a sharp image corresponding to the image to be processed, may be obtained.
According to the image processing method provided by this embodiment, an image to be processed is acquired, image coding information corresponding to it is determined, at least one reference image for processing it is determined based on the image coding information, and the image to be processed is then processed using the at least one reference image to obtain a corresponding target image. This not only ensures the effect of processing the image to be processed but also reduces the difficulty of doing so, so that the image processing method can be applied in various application scenarios, further improving its practicability.
Fig. 2 is a schematic flowchart of inputting at least one reference image and an image to be processed into a third machine learning model according to an embodiment of the present invention. On the basis of the foregoing embodiment and referring to Fig. 2, this embodiment does not limit the specific implementation of inputting the at least one reference image and the image to be processed into the third machine learning model; those skilled in the art may set it according to specific application requirements and design requirements. Preferably, the inputting in this embodiment may include:
step S201: and acquiring a fusion characteristic image corresponding to at least one reference image and the image to be processed.
Specifically, after obtaining the at least one reference image, the at least one reference image and the to-be-processed image may be analyzed, so that a fused feature image corresponding to the at least one reference image and the to-be-processed image may be obtained, where in some examples, obtaining the fused feature image corresponding to the at least one reference image and the to-be-processed image may include:
step S2011: and splicing the at least one reference image and the image to be processed on the image channel to obtain a fusion characteristic image.
In general, an image may consist of three image channels: red, green and blue. Therefore, after the at least one reference image and the image to be processed are obtained, they may be stitched together along the image channels, yielding a fused feature image in which the image features of the at least one reference image and the image to be processed are combined.
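Step S2011 thus amounts to concatenation along the channel axis. In PyTorch's (N, C, H, W) layout this could look as follows (a sketch, assuming both images have already been resized to the same spatial dimensions):

```python
import torch

def stitch_on_channels(reference, to_process):
    # reference, to_process: (N, 3, H, W) RGB tensors of equal size.
    # The fused feature image simply stacks the two along the channel
    # axis, giving an (N, 6, H, W) input for the third machine learning model.
    return torch.cat([reference, to_process], dim=1)
```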
In other examples, referring to Fig. 3, acquiring the fused feature image corresponding to the at least one reference image and the image to be processed may further include:
step S2012: acquiring a first characteristic image corresponding to at least one reference image and a second characteristic image corresponding to an image to be processed.
Step S2013: and carrying out fusion processing on the first characteristic image and the second characteristic image to obtain a fusion characteristic image.
After the at least one reference image and the image to be processed are acquired, they may be analyzed to obtain a first feature image corresponding to the at least one reference image and a second feature image corresponding to the image to be processed. Specifically, this may include: analyzing the at least one reference image by using a machine learning model to obtain the first feature image, and similarly analyzing the image to be processed by using the machine learning model to obtain the second feature image. The machine learning model may be previously trained to determine the feature image corresponding to an image, where the feature image may include at least one of the following: pixel feature information, color feature information, luminance feature information, texture feature information, edge feature information, shape feature information and semantic feature information. In addition, the machine learning model may be established based on a convolutional neural network during its learning training.
Of course, those skilled in the art may also obtain the image features corresponding to the at least one reference image and to the image to be processed in other ways, as long as the features can be obtained accurately and reliably; details are not repeated here.
After the first feature image and the second feature image are acquired, they can be fused to obtain a fused feature image in which the image features of the at least one reference image and of the image to be processed are combined.
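Steps S2012-S2013 can be sketched with a shared feature extractor and a simple fusion operation. The two-layer extractor and the channel-wise concatenation used for fusion are assumptions; the patent leaves both the extractor and the fusion operator open:

```python
import torch
import torch.nn as nn

feature_net = nn.Sequential(                  # hypothetical shared extractor
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
)

def fuse_features(reference, to_process):
    first = feature_net(reference)            # first feature image
    second = feature_net(to_process)          # second feature image
    return torch.cat([first, second], dim=1)  # fused feature image
```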
It is to be understood that the manner of obtaining the fusion feature image corresponding to the at least one reference image and the to-be-processed image is not limited to the above-mentioned exemplary manner, and those skilled in the art may also use other manners to obtain the fusion feature image corresponding to the at least one reference image and the to-be-processed image, as long as the accuracy and reliability of obtaining the fusion feature image can be ensured.
Step S202: the fused feature image is input to a third machine learning model.
After the fusion feature image is acquired, the fusion feature image can be input to the third machine learning model, so that the third machine learning model can realize image processing of the image to be processed based on the fusion feature image, and the effect of processing the image to be processed is effectively ensured.
Fig. 4 is a schematic flowchart of another image processing method according to an embodiment of the present invention. On the basis of any of the above embodiments, and with reference to Fig. 4, the method in this embodiment may further include:
step S401: and acquiring characteristic parameters for identifying the definition of the image to be processed.
Step S402: and when the characteristic parameters meet the preset conditions, determining confidence information between the characteristic parameters and the target image based on the characteristic parameters and at least one reference image.
Step S403: and when the confidence coefficient information is smaller than a preset limit value, ignoring the target image.
Wherein the characteristic parameter may include at least one of: noise parameters, ambiguity parameters, resolution parameters. Specifically, the noise parameter is used to identify unnecessary or redundant interference information in the image data, and the blur degree parameter is used to identify the blur degree of the image data. The resolution parameters may include a display resolution, which is the precision of a screen image, and an image resolution, which is the number of pixels contained in a unit inch.
It should be noted that the feature parameters may include not only the above-mentioned noise parameter, ambiguity parameter and resolution parameter, but also other feature parameters may be configured by those skilled in the art according to specific application requirements and design requirements, for example, the feature parameters may also include color features, texture features, shape features, spatial features, and the like, and will not be described herein again.
After the feature parameter for identifying the definition of the image to be processed is obtained, it is compared against a preset condition, which may be pre-configured condition information for identifying low definition: when the feature parameter satisfies the preset condition, the definition of the image to be processed is low; when it does not, the definition is high.
It can be understood that when the image to be processed with lower definition is processed, the confidence of the target image after the image processing is lower, and when the image to be processed with higher definition is processed, the confidence of the target image after the image processing is higher.
Therefore, to let a user know the confidence of the image processing in time, when the feature parameter satisfies the preset condition, confidence information about the target image is determined based on the feature parameter and the at least one reference image, and whether the target image is usable is determined from that confidence information. Specifically, the feature parameter and the at least one reference image may be analyzed by a preset machine learning model to determine the confidence information of the target image, which is then compared against a preset limit: when the confidence information is smaller than the preset limit, the confidence of the target image is low and the target image may be ignored; when it is greater than or equal to the preset limit, the confidence is high and the target image may be stored.
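Steps S401-S403 can be sketched as follows. The variance-of-Laplacian sharpness measure, the thresholds and the injected confidence model are all assumptions used only for illustration:

```python
import cv2

def gate_target_image(to_process, reference, target, confidence_model,
                      blur_threshold=100.0, confidence_limit=0.5):
    # to_process: HxWx3 BGR uint8 array (OpenCV convention).
    # Feature parameter: variance of the Laplacian as a sharpness proxy.
    gray = cv2.cvtColor(to_process, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()

    if sharpness < blur_threshold:                  # preset condition: low definition
        confidence = confidence_model(to_process, reference)
        if confidence < confidence_limit:
            return None                             # ignore the target image
    return target                                   # keep (store) the target image
```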
In this embodiment, by obtaining the feature parameter for identifying the sharpness of the image to be processed, when the feature parameter meets the preset condition, the confidence information with the target image is determined based on the feature parameter and the at least one reference image, and when the confidence information is smaller than the preset limit value, the target image is ignored, so that the quality and efficiency of obtaining the target image are effectively ensured, and a user can timely know the confidence information of image processing.
In a specific application, referring to Figs. 5 to 8, this application embodiment provides an image processing method whose execution subject may be an image processing apparatus. Described here with reference to a processing apparatus capable of enhancing an image, the apparatus may include a first machine learning model, a second machine learning model and a third machine learning model. In a specific application, the first machine learning model may be a coding network for generating the coding information of the image to be processed, the second machine learning model may be a clear face generation network established with a Generative Adversarial Network (GAN), and the third machine learning model may be an enhancement network established with a GAN, as shown in Fig. 5. It should be noted that, in a specific application, the first and second machine learning models may be merged into one machine learning model, which processes the image to be processed, generates its coding information, and determines at least one reference image based on that coding information, as shown in Fig. 6.
For convenience of understanding, a human face image is taken as an example to be described, specifically, when the enhancement method is used to enhance a human face image, a coding network, a clear human face generation network and an enhancement network may be constructed first, and a specific network construction process may include:
firstly, a clear face generation network is constructed and generated.
When a clear face generation network is constructed, a large number of clear face images can be obtained first, and particularly, a large number of clear face images can be obtained through photos, videos or collection in a preset database. After a large number of clear face images are obtained, random coding information Z can be obtained, then, a large number of clear face images and random coding information Z are subjected to learning training based on a GAN network, and a clear face generation network can be constructed and generated, and the clear face generation network can generate a clear face image based on any random coding information Z.
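A hypothetical generator for the clear face generation network could upsample a random coding z into a face image. The transposed-convolution architecture below, and its low 32x32 output resolution, are illustrative assumptions only:

```python
import torch
import torch.nn as nn

class ClearFaceGenerator(nn.Module):
    """Hypothetical clear face generation network: maps a coding vector z
    to a clear face image (32x32 here purely for brevity)."""
    def __init__(self, code_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(code_dim, 256, 4), nn.ReLU(),                  # 4x4
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),  # 8x8
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 16x16
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),     # 32x32
        )

    def forward(self, z):                             # z: (N, code_dim)
        return self.net(z.view(z.size(0), -1, 1, 1))
```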
In addition, in order to ensure the quality of the clear face images generated by the clear face generation network, when it is constructed, a plurality of first images (blurred images) and a plurality of image codes corresponding to them may be acquired first. The image codes are then processed by a deep convolutional neural network to obtain a plurality of second images, after which the definition of the first image, the definition of the second image and the similarity between the two are determined. When the definition of the second image is higher than that of the first image and the similarity is greater than or equal to a preset threshold, the learning training with the convolutional neural network can be stopped, thereby generating the clear face generation network.
Secondly, a coding network is constructed and generated.
When the above clear face generation network is used to generate a clear face image, a coding network may be added before it in order to control the structure and appearance of the generated image. The coding network may be generated by learning training based on a convolutional neural network; specifically, it can encode any blurred face image, extracting the deep features of the blurred face as the coding information z.
After obtaining the coded information z, the coded information z may be input to a clear face generation network, so that the clear face generation network may analyze and process the coded information z, and may further obtain a clear face image with an appearance structure similar to that of the blurred face image, where the clear face image is used as at least one reference image, as shown in fig. 7 to 8.
It should be noted that, when constructing and generating the coding network, the similarity between the clear face image and the blurred face image determined by the clear face generation network based on the coding information z may be considered, and when the similarity between the clear face image and the blurred face image is greater than or equal to a preset threshold, the coding processing on the blurred face may be stopped, and the coding information z corresponding to the blurred face image may be determined, so that the coding network may be generated. After the coding network is constructed and generated, the coding information of the fuzzy image can be generated through the coding network, and after the coding information is input into the clear face generation network, the clear face image with the appearance structure similar to that of the fuzzy face image can be output.
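Training the coding network against a frozen clear face generation network can be sketched as below, reusing the two hypothetical modules above. The pixel-level L1 loss as a stand-in for structure/appearance similarity is an assumption; the patent only requires the similarity to reach a preset threshold:

```python
import torch
import torch.nn.functional as F

encoder = CodingNetwork()           # hypothetical coding network (sketched earlier)
generator = ClearFaceGenerator()    # pretrained clear face generation network
for p in generator.parameters():
    p.requires_grad_(False)         # the generator stays frozen

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

def encoder_step(blurred_faces):                 # (N, 3, H, W), normalized to [-1, 1]
    z = encoder(blurred_faces)                   # coding information z
    clear_refs = generator(z)                    # clear face reference images
    refs = F.interpolate(clear_refs, size=blurred_faces.shape[-2:])
    loss = F.l1_loss(refs, blurred_faces)        # similarity proxy (assumption)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```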
Thirdly, an enhancement network is constructed and generated.
A plurality of blurred face images, a plurality of clear face reference images and target images corresponding to the blurred face images are obtained, and a deep convolutional neural network is trained on them to generate the enhancement network. The enhancement network can use the details of a clear face reference image to perform image enhancement on the details of a blurred face image, thereby obtaining the target images corresponding to the blurred face images.
Based on the clear face generation network, coding network and enhancement network constructed above, the image processing method provided by this application embodiment can be implemented. Specifically, the method may include the following steps (a sketch combining them is given after the list):
Step 1: acquiring a blurred face image;
Step 2: inputting the blurred face image into the coding network to obtain the image coding information z corresponding to the blurred face image;
Step 3: inputting the image coding information z into the clear face generation network to obtain a clear face reference image corresponding to it, where the definition of the clear face reference image is higher than that of the blurred face image and the similarity between the two is greater than or equal to a preset threshold;
Step 4: inputting the clear face reference image and the blurred face image into the enhancement network to obtain a target image corresponding to the blurred face image.
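Putting steps 1 to 4 together, the inference pipeline can be sketched as follows; all three networks are the hypothetical modules sketched above, and the enhancement network is assumed to accept the channel-stitched input:

```python
import torch

def enhance_blurred_face(blurred_face, coding_net, clear_face_net, enhance_net):
    """Sketch of steps 1-4: blurred face -> coding information z ->
    clear face reference image -> enhanced target image. Assumes the
    generated reference matches the input's spatial size."""
    with torch.no_grad():
        z = coding_net(blurred_face)                         # step 2
        reference = clear_face_net(z)                        # step 3
        fused = torch.cat([reference, blurred_face], dim=1)  # channel stitching
        return enhance_net(fused)                            # step 4
```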
When the clear face reference image and the blurred face image are input to the enhancement network, they may first be feature-fused to obtain a corresponding fused feature image. One way to implement this fusion is to stitch the features of the blurred face image and the clear face reference image along the image channels. Another way is to extract features from the blurred face image with a preset machine learning model to obtain first multilevel features, extract features from the clear face reference image with the same model to obtain second multilevel features, and fuse the first and second multilevel features into the fused feature image.
After the fused feature image is input to the enhancement network, the network can migrate the detail features (some or all of them) from the clear face reference image onto the blurred face image, realizing high-quality enhancement of the face image and thereby ensuring the quality and effect of the obtained target image.
The image processing method provided by this application embodiment exploits the strong generative capacity of a GAN: a blurred image can be encoded and a clear face image with a similar structure and appearance can be output, so that a face reference image is acquired automatically and efficiently; the blurred face can then be sharpened using that reference image. This effectively reduces the difficulty of acquiring a face reference image, and since the person in the generated reference image may be the same as or different from the person in the blurred image, it removes the prior-art constraint that the reference image must depict the same target. The image processing method can therefore be applied in a wider range of scenarios, further improving its practicability.
Fig. 9 is a schematic flowchart of a model training method according to an embodiment of the present invention. Referring to Fig. 9, this embodiment provides a model training method whose execution subject may be a model training apparatus; it is understood that the apparatus may be implemented as software, or as a combination of software and hardware. Specifically, the method may include:
step S801: acquiring a plurality of first images and a plurality of image codes corresponding to the plurality of first images, wherein the image codes are used for identifying information included in the first images;
step S802: processing the plurality of image codes by using a convolutional neural network to obtain a plurality of second images;
step S803: determining the definition of the first image, the definition of the second image and the similarity of the first image and the second image;
step S804: and when the definition of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold value, generating a machine learning model.
In some examples, the first image is a face image, and the similarity of the first image to the second image includes: similarity between the structure and appearance of the face in the first image and the structure and appearance of the face in the second image.
In some examples, the structure of the face includes at least one of: face orientation, pose, position information of the face relative to the image; the appearance of the human face includes at least one of: hair features, skin tone features, brightness features, color features.
The plurality of first images may be a plurality of preset blurred images, and the plurality of first images may include at least one of: image information obtained by photographing by a photographing device, image information in video information, a composite image, and the like. The embodiment does not limit the specific implementation manner of the training device for acquiring the plurality of first images, and a person skilled in the art may set the first images according to specific application requirements and design requirements, for example: the shooting device may be in communication connection with the training device, and after the shooting device shoots and obtains the first image, the training device may obtain the first image through the shooting device, specifically, the training device may actively obtain the first image obtained by the shooting device, or the shooting device may actively send the first image to the training device, so that the training device obtains the first image. Still alternatively, the first image may be stored in a preset area, and the training apparatus may obtain the first image by accessing the preset area.
After the plurality of first images are obtained, they may be analyzed to determine the image code corresponding to each first image. The specific manner of obtaining the image code corresponding to a first image is similar to the corresponding step in the embodiment shown in fig. 1; reference may be made to the statements above, and details are not repeated here.
After the plurality of image codes are obtained, they may be processed by the convolutional neural network to obtain a second image corresponding to each image code. Then, the sharpness of the first image, the sharpness of the second image, and the similarity between the first image and the second image may be determined. When the sharpness of the second image is higher than that of the first image and the similarity is greater than or equal to a preset threshold, a machine learning model may be generated based on the convolutional neural network. This machine learning model is the second machine learning model, i.e., the clear face generation network, in the above embodiments, and it can determine, based on image coding information, a clear image for processing the blurred image.
In the model training method provided by this embodiment, a plurality of first images and the corresponding plurality of image codes are obtained; the image codes are processed by a convolutional neural network to obtain a plurality of second images; the sharpness of the first image, the sharpness of the second image, and the similarity between the first image and the second image are determined; and a machine learning model is generated when the sharpness of the second image differs from that of the first image and the similarity is greater than or equal to a preset threshold. The machine learning model can determine, based on image coding information, a reference image whose sharpness differs from that of an image to be processed; after the reference image is obtained, the image to be processed can be processed based on it, which improves the practicability of the model training method.
Fig. 10 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention; referring to fig. 10, the present embodiment provides an image processing apparatus that can execute the image processing method shown in fig. 1, and specifically, the image processing apparatus may include:
the first acquisition module 11 is used for acquiring an image to be processed;
a first determining module 12, configured to determine image coding information corresponding to an image to be processed, where the image coding information is used to identify information included in the image to be processed;
the first determining module 12 is further configured to determine, based on the image coding information, at least one reference image for processing the image to be processed, where the at least one reference image satisfies a preset condition;
the first processing module 13 is configured to perform image processing on the image to be processed by using at least one reference image, so as to obtain a target image corresponding to the image to be processed.
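The division of modules above mirrors the method of fig. 1. As a purely illustrative sketch (the module internals and names are assumptions, not the embodiment's concrete design), the modules could be wired together as follows:

```python
# Illustrative composition of the apparatus modules; the callables passed
# in are placeholders for the embodiment's trained models.
class ImageProcessingApparatus:
    def __init__(self, determine_encoding, determine_references, process):
        self.determine_encoding = determine_encoding      # first determining module 12
        self.determine_references = determine_references  # first determining module 12
        self.process = process                            # first processing module 13

    def run(self, image_to_be_processed):
        # first acquiring module 11: the image to be processed arrives here
        coding_info = self.determine_encoding(image_to_be_processed)
        references = self.determine_references(coding_info)
        return self.process(references, image_to_be_processed)
```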
In some examples, when the first determining module 12 determines the image coding information corresponding to the image to be processed, the first determining module 12 may be configured to: extract image feature information of the image to be processed, and determine the image feature information as the image coding information corresponding to the image to be processed.
In some examples, the image characteristic information includes at least one of: shallow feature information, deep feature information.
In some examples, when the first determining module 12 extracts the image feature information of the image to be processed, the first determining module 12 may be configured to: process the image to be processed by using a first machine learning model to obtain the image feature information corresponding to the image to be processed, where the first machine learning model is trained to extract image feature information of an image.
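As an illustration of such a first machine learning model, the sketch below returns both shallow and deep feature information; where the shallow/deep split falls is an assumption, since the embodiment does not fix an architecture.

```python
# Sketch of a first-machine-learning-model style extractor exposing both
# shallow and deep feature information; the split point is assumed.
import torch.nn as nn

class FeatureExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        self.shallow = nn.Sequential(                 # low-level edges and textures
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.deep = nn.Sequential(                    # higher-level structure
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        shallow_info = self.shallow(x)
        deep_info = self.deep(shallow_info)
        # either (or both) can serve as the image coding information
        return shallow_info, deep_info
```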
In some examples, the preset condition includes at least one of: the sharpness of the at least one reference image is different from that of the image to be processed; the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold.
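A small predicate capturing this preset condition might look as follows, reusing the illustrative `estimate_sharpness` and `face_similarity` helpers sketched earlier; the 0.8 threshold is an assumed value.

```python
# Predicate for the preset condition: the sharpness must differ and the
# similarity must reach the threshold; both measures are the illustrative
# proxies defined in the earlier sketch.
import torch

def meets_preset_condition(reference, to_be_processed, threshold: float = 0.8) -> bool:
    sharpness_differs = not torch.isclose(
        estimate_sharpness(reference), estimate_sharpness(to_be_processed))
    similar_enough = face_similarity(reference, to_be_processed) >= threshold
    return bool(sharpness_differs and similar_enough)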
In some examples, when the first determining module 12 determines at least one reference image for processing the image to be processed based on the image coding information, the first determining module 12 may be configured to: process the image coding information by using a second machine learning model to obtain at least one reference image for processing the image to be processed. The second machine learning model is trained to determine, based on the image coding information, at least one reference image whose sharpness is different from that of the image to be processed and whose similarity to the image to be processed is greater than or equal to a preset threshold.
In some examples, the image to be processed is a face image to be processed, and the similarity between the at least one reference image and the image to be processed includes: the similarity between the structure and appearance of the face in the at least one reference image and the structure and appearance of the face in the image to be processed.
In some examples, the structure of the face includes at least one of: face orientation, pose, position information of the face relative to the image; the appearance of the human face includes at least one of: hair features, skin tone features, brightness features, color features.
In some examples, when the first processing module 13 performs image processing on the image to be processed by using the at least one reference image to obtain a target image corresponding to the image to be processed, the first processing module 13 may be configured to: input the at least one reference image and the image to be processed into a third machine learning model, so that the third machine learning model performs image processing on the image to be processed based on the at least one reference image to obtain the target image. The third machine learning model is trained to perform image processing on an image to be processed based on at least one reference image whose sharpness is different from that of the image to be processed.
In some examples, when the first processing module 13 inputs the at least one reference image and the image to be processed to the third machine learning model, the first processing module 13 may be configured to: acquire a fused feature image corresponding to the at least one reference image and the image to be processed, and input the fused feature image to the third machine learning model.
In some examples, when the first processing module 13 acquires the fused feature image corresponding to the at least one reference image and the image to be processed, the first processing module 13 may be configured to: splice the at least one reference image and the image to be processed along the image channel to obtain the fused feature image.
In some examples, when the first processing module 13 acquires the fused feature image corresponding to the at least one reference image and the image to be processed, the first processing module 13 may be configured to: acquire a first feature image corresponding to the at least one reference image and a second feature image corresponding to the image to be processed, and fuse the first feature image with the second feature image to obtain the fused feature image.
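The two fusion options can be illustrated on PyTorch tensors as below; the element-wise addition in `fuse_features` is only one of several fusion operators the embodiment leaves unspecified.

```python
# Sketch of the two fusion options: channel splicing of the raw images,
# and element-wise fusion of per-image feature maps (assumed operator).
import torch

def fuse_by_channel(reference: torch.Tensor, to_be_processed: torch.Tensor) -> torch.Tensor:
    # (N, 3, H, W) spliced with (N, 3, H, W) -> (N, 6, H, W)
    return torch.cat([reference, to_be_processed], dim=1)

def fuse_features(first_feature: torch.Tensor, second_feature: torch.Tensor) -> torch.Tensor:
    # element-wise fusion of the reference and to-be-processed feature maps
    return first_feature + second_feature
```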
In some examples, the sharpness of the target image is higher than that of the image to be processed; or, the sharpness of the target image is lower than that of the image to be processed.
In some examples, the first obtaining module 11 and the first processing module 13 in this embodiment may be further configured to perform the following steps:
the first obtaining module 11 is configured to obtain a feature parameter for identifying the sharpness of the image to be processed;
the first processing module 13 is configured to: when the feature parameter meets a preset condition, determine confidence information of the target image based on the feature parameter and the at least one reference image; and when the confidence information is smaller than a preset limit, ignore the target image.
In some examples, the feature parameter includes at least one of: a noise parameter, a blur parameter, a resolution parameter.
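The confidence check can be sketched as follows. The noise/blur estimators and the scoring function are passed in as parameters because the embodiment does not specify them; all limit values are assumed.

```python
# Sketch of the confidence check: when the sharpness-related parameters
# meet the preset condition, score the target image and discard it below
# a preset limit. All helpers and limits here are assumptions.
def maybe_ignore_target(target, references, image_to_be_processed,
                        estimate_noise, estimate_blur, score_confidence,
                        noise_limit=0.3, blur_limit=0.5, confidence_limit=0.5):
    # feature parameters identifying the sharpness of the image to be processed
    noise = estimate_noise(image_to_be_processed)
    blur = estimate_blur(image_to_be_processed)
    # parameters meet the preset condition (e.g. heavy noise or strong blur)
    if noise > noise_limit or blur > blur_limit:
        confidence = score_confidence((noise, blur), references, target)
        if confidence < confidence_limit:
            return None  # confidence below the preset limit: ignore the target image
    return target
```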
The apparatus shown in fig. 10 can perform the methods of the embodiments shown in figs. 1-8; for the parts of this embodiment that are not described in detail, reference may be made to the related descriptions of the embodiments shown in figs. 1-8. For the implementation process and technical effect of this technical solution, refer to the descriptions in the embodiments shown in figs. 1-8, which are not repeated here.
In one possible design, the image processing apparatus shown in fig. 10 may be implemented as an electronic device, such as a mobile phone, a tablet computer, or a server. As shown in fig. 11, the electronic device may include: a first processor 21 and a first memory 22, where the first memory 22 is used for storing a program for executing the image processing method provided in the embodiments shown in figs. 1-8, and the first processor 21 is configured to execute the program stored in the first memory 22.
The program comprises one or more computer instructions, wherein the one or more computer instructions, when executed by the first processor 21, are capable of performing the steps of:
acquiring an image to be processed;
determining image coding information corresponding to an image to be processed, wherein the image coding information is used for identifying information included in the image to be processed;
determining at least one reference image for processing an image to be processed based on image coding information, wherein the at least one reference image meets a preset condition;
and performing image processing on the image to be processed by utilizing at least one reference image to obtain a target image corresponding to the image to be processed.
Further, the first processor 21 is also used to execute all or part of the steps in the embodiments shown in fig. 1 to 8.
The electronic device may further include a first communication interface 23 for communicating with other devices or a communication network.
In addition, an embodiment of the present invention provides a computer storage medium for storing computer software instructions for an electronic device, which includes a program for executing the image processing method in the method embodiments shown in fig. 1 to 8.
FIG. 12 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention; referring to fig. 12, the present embodiment provides a model training apparatus, which may perform the model training method shown in fig. 9, and specifically, the model training apparatus may include:
a second obtaining module 31, configured to obtain a plurality of first images and a plurality of image codes corresponding to the plurality of first images, where the image codes are used to identify information included in the first images;
a second processing module 32, configured to process the multiple image codes by using a convolutional neural network, so as to obtain multiple second images;
a second determining module 33, configured to determine the sharpness of the first image, the sharpness of the second image, and the similarity between the first image and the second image;
and a second generating module 34, configured to generate the machine learning model when the sharpness of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold.
In some examples, the first image is a face image, and the similarity of the first image to the second image includes: similarity between the structure and appearance of the face in the first image and the structure and appearance of the face in the second image.
In some examples, the structure of the face includes at least one of: face orientation, pose, position information of the face relative to the image; the appearance of the human face includes at least one of: hair features, skin tone features, brightness features, color features.
The apparatus shown in fig. 12 can execute the method of the embodiment shown in fig. 9, and reference may be made to the related description of the embodiment shown in fig. 9 for a part of this embodiment that is not described in detail. The implementation process and technical effect of the technical solution are described in the embodiment shown in fig. 9, and are not described herein again.
In one possible design, the model training apparatus shown in fig. 12 may be implemented as an electronic device, such as a mobile phone, a tablet computer, or a server. As shown in fig. 13, the electronic device may include: a second processor 41 and a second memory 42, where the second memory 42 is used for storing a program for executing the model training method provided in the embodiment shown in fig. 9, and the second processor 41 is configured to execute the program stored in the second memory 42.
The program comprises one or more computer instructions, wherein the one or more computer instructions, when executed by the second processor 41, are capable of performing the steps of:
acquiring a plurality of first images and a plurality of image codes corresponding to the plurality of first images, wherein the image codes are used for identifying information included in the first images;
processing the plurality of image codes by using a convolutional neural network to obtain a plurality of second images;
determining the sharpness of the first image, the sharpness of the second image, and the similarity between the first image and the second image;
and generating a machine learning model when the sharpness of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold.
Further, the second processor 41 is also used to execute all or part of the steps in the embodiment shown in fig. 9.
The electronic device may further include a second communication interface 43 for communicating with other devices or a communication network.
In addition, an embodiment of the present invention provides a computer storage medium for storing computer software instructions for an electronic device, which includes a program for executing the model training method in the method embodiment shown in fig. 9.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and of course can also be implemented by a combination of hardware and software. Based on this understanding, the above technical solutions, in essence or in the part contributing to the prior art, may be embodied in the form of a computer program product, which may be implemented on one or more computer-usable storage media (including, without limitation, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape/magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (22)
1. An image processing method, comprising:
acquiring an image to be processed;
determining image coding information corresponding to the image to be processed, wherein the image coding information is used for identifying information included in the image to be processed;
determining at least one reference image for processing the image to be processed based on the image coding information, wherein the at least one reference image meets a preset condition;
and performing image processing on the image to be processed by using the at least one reference image to obtain a target image corresponding to the image to be processed.
2. The method of claim 1, wherein determining image coding information corresponding to the image to be processed comprises:
extracting image characteristic information of the image to be processed;
and determining the image characteristic information as image coding information corresponding to the image to be processed.
3. The method of claim 2,
the image characteristic information includes at least one of: shallow feature information, deep feature information.
4. The method according to claim 2, wherein extracting image feature information of the image to be processed comprises:
and processing the image to be processed by utilizing a first machine learning model to obtain image characteristic information corresponding to the image to be processed, wherein the first machine learning model is trained to be used for extracting the image characteristic information of the image.
5. The method of claim 1, wherein the preset condition comprises at least one of:
the sharpness of the at least one reference image is different from that of the image to be processed;
the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold value.
6. The method of claim 1, wherein determining at least one reference picture for processing the picture to be processed based on the picture coding information comprises:
processing the image coding information by using a second machine learning model to obtain at least one reference image for processing the image to be processed, wherein the second machine learning model is trained to determine, based on the image coding information, at least one reference image for processing the image to be processed, the sharpness of the at least one reference image is different from that of the image to be processed, and the similarity between the at least one reference image and the image to be processed is greater than or equal to a preset threshold.
7. The method according to claim 6, wherein the image to be processed is a face image to be processed, and the similarity between the at least one reference image and the image to be processed comprises: the similarity between the structure and appearance of the face in the at least one reference image and the structure and appearance of the face in the image to be processed.
8. The method of claim 7,
the structure of the face comprises at least one of the following: face orientation, pose, position information of the face relative to the image;
the appearance of the face comprises at least one of: hair features, skin tone features, brightness features, color features.
9. The method according to claim 1, wherein performing image processing on the image to be processed by using the at least one reference image to obtain a target image corresponding to the image to be processed comprises:
inputting the at least one reference image and the image to be processed into a third machine learning model, so as to perform image processing on the image to be processed based on the at least one reference image by using the third machine learning model and obtain a target image corresponding to the image to be processed, wherein the third machine learning model is trained to perform image processing on the image to be processed based on the at least one reference image, and the sharpness of the at least one reference image is different from that of the image to be processed.
10. The method of claim 9, wherein inputting the at least one reference image and the image to be processed to a third machine learning model comprises:
acquiring a fused feature image corresponding to the at least one reference image and the image to be processed;
inputting the fused feature image to the third machine learning model.
11. The method of claim 10, wherein obtaining a fused feature image corresponding to the at least one reference image and the image to be processed comprises:
and splicing the at least one reference image and the image to be processed along the image channel to obtain the fused feature image.
12. The method of claim 10, wherein obtaining a fused feature image corresponding to the at least one reference image and the image to be processed comprises:
acquiring a first feature image corresponding to the at least one reference image and a second feature image corresponding to the image to be processed;
and fusing the first feature image with the second feature image to obtain the fused feature image.
13. The method of claim 9,
the sharpness of the target image is higher than that of the image to be processed; or,
the sharpness of the target image is lower than that of the image to be processed.
14. The method according to any one of claims 1-13, further comprising:
acquiring a characteristic parameter for identifying the sharpness of the image to be processed;
when the characteristic parameters meet preset conditions, determining confidence information of the target image based on the characteristic parameters and the at least one reference image;
and when the confidence coefficient information is smaller than a preset limit value, ignoring the target image.
15. The method of claim 14,
the characteristic parameter includes at least one of: a noise parameter, a blur parameter, a resolution parameter.
16. A method of model training, comprising:
acquiring a plurality of first images and a plurality of image codes corresponding to the plurality of first images, wherein the image codes are used for identifying information included in the first images;
processing the plurality of image codes by using a convolutional neural network to obtain a plurality of second images;
determining the sharpness of the first image, the sharpness of the second image, and the similarity between the first image and the second image;
and generating a machine learning model when the sharpness of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold.
17. The method of claim 16, wherein the first image is a face image, and the similarity between the first image and the second image comprises: the similarity between the structure and appearance of the face in the first image and the structure and appearance of the face in the second image.
18. The method of claim 17,
the structure of the face comprises at least one of the following: face orientation, pose, position information of the face relative to the image;
the appearance of the face comprises at least one of: hair features, skin tone features, brightness features, color features.
19. An image processing apparatus characterized by comprising:
the first acquisition module is used for acquiring an image to be processed;
a first determining module, configured to determine image coding information corresponding to the image to be processed, where the image coding information is used to identify information included in the image to be processed;
the first determining module is configured to determine, based on the image coding information, at least one reference image for processing the image to be processed, where the at least one reference image satisfies a preset condition;
and the first processing module is used for carrying out image processing on the image to be processed by utilizing the at least one reference image to obtain a target image corresponding to the image to be processed.
20. An electronic device, comprising: a memory, a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, implement the image processing method of any of claims 1-15.
21. A model training apparatus, comprising:
a second obtaining module, configured to obtain a plurality of first images and a plurality of image codes corresponding to the plurality of first images, where the image codes are used to identify information included in the first images;
the second processing module is used for processing the image codes by utilizing a convolutional neural network to obtain a plurality of second images;
a second determining module, configured to determine a sharpness of the first image, a sharpness of the second image, and a similarity between the first image and the second image;
and the second generating module is configured to generate a machine learning model when the sharpness of the second image is different from that of the first image and the similarity is greater than or equal to a preset threshold.
22. An electronic device, comprising: a memory, a processor; wherein the memory is to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, implement the model training method of any one of claims 16-18.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010120806.XA CN113313635A (en) | 2020-02-26 | 2020-02-26 | Image processing method, model training method, device and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113313635A (en) | 2021-08-27
Family
ID=77370013
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010120806.XA (pending, published as CN113313635A) | Image processing method, model training method, device and equipment | 2020-02-26 | 2020-02-26
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113313635A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113538075A (en) * | 2020-04-14 | 2021-10-22 | 阿里巴巴集团控股有限公司 | Data processing method, model training method, device and equipment |
CN115984947A (en) * | 2023-02-21 | 2023-04-18 | 北京百度网讯科技有限公司 | Image generation method, training method, device, electronic device and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108921806A (en) * | 2018-08-07 | 2018-11-30 | Oppo广东移动通信有限公司 | A kind of image processing method, image processing apparatus and terminal device |
CN109117760A (en) * | 2018-07-27 | 2019-01-01 | 北京旷视科技有限公司 | Image processing method, device, electronic equipment and computer-readable medium |
CN110782397A (en) * | 2018-12-13 | 2020-02-11 | 北京嘀嘀无限科技发展有限公司 | Image processing method, generation type countermeasure network, electronic equipment and storage medium |
Similar Documents
Publication | Title
---|---
CN109325954B | Image segmentation method and device and electronic equipment
CN109325933B | Method and device for recognizing copied image
CN106254933B | Subtitle extraction method and device
US10977809B2 | Detecting motion dragging artifacts for dynamic adjustment of frame rate conversion settings
US9152878B2 | Image processing apparatus, image processing method, and storage medium
US7974470B2 | Method and apparatus for processing an image
CN110738611B | Video image quality enhancement method, system and equipment
Hadizadeh et al. | Video error concealment using a computation-efficient low saliency prior
US10904476B1 | Techniques for up-sampling digital media content
CN110730381A | Method, device, terminal and storage medium for synthesizing video based on video template
CN115619683B | Image processing method, apparatus, device, storage medium, and computer program product
CN110781740A | Video image quality identification method, system and equipment
CN110414335A | Video frequency identifying method, device and computer readable storage medium
CN113313635A | Image processing method, model training method, device and equipment
CN112383824A | Video advertisement filtering method, device and storage medium
CN115471413A | Image processing method and device, computer readable storage medium and electronic device
CN113516592A | Image processing method, model training method, device and equipment
US10923154B2 | Systems and methods for determining highlight segment sets
WO2015189369A1 | Methods and systems for color processing of digital images
JP5801614B2 | Image processing apparatus and image processing method
CN113253890A | Video image matting method, system and medium
US20230343017A1 | Virtual viewport generation method and apparatus, rendering and decoding methods and apparatuses, device and storage medium
CN115294162B | Target identification method, device, equipment and storage medium
Kahatapitiya et al. | Context-aware automatic occlusion removal
CN114500879A | Video data processing method, device, equipment and storage medium
Legal Events
Code | Title | Description
---|---|---
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210827