
WO2021233215A1 - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
WO2021233215A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
shadow
target
sample
trained
Prior art date
Application number
PCT/CN2021/093754
Other languages
English (en)
French (fr)
Inventor
黄振
李巧
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Priority to EP21808679.1A (EP4156081A4)
Publication of WO2021233215A1
Priority to US17/987,987 (US20230076026A1)

Classifications

    • G06T 5/80: Image enhancement or restoration; geometric correction
    • G06T 5/94: Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 18/251: Fusion techniques of input or preprocessed data
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/0475: Neural networks; generative networks
    • G06N 3/08: Neural networks; learning methods
    • G06T 15/60: 3D image rendering; lighting effects; shadow generation
    • G06T 5/60: Image enhancement or restoration using machine learning, e.g. neural networks
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06V 30/40: Document-oriented image-based pattern recognition
    • H04N 23/74: Circuitry for compensating brightness variation in the scene by influencing the scene brightness using illuminating means
    • H04N 23/80: Camera processing pipelines; components thereof
    • G06T 2207/20081: Indexing scheme for image analysis or image enhancement; training/learning
    • G06T 2207/20084: Indexing scheme for image analysis or image enhancement; artificial neural networks [ANN]
    • G06T 2207/30176: Indexing scheme for image analysis or image enhancement; subject of image: document

Definitions

  • This application belongs to the field of communication technology, and specifically relates to an image processing method and device.
  • When a user takes a photo with a camera device, limitations such as lighting and shooting angle often cause the captured image to contain shadows of obstructions such as people or scenery.
  • For example, when photographing a document, occlusion by the camera often leaves a shadow cast by the obstruction in the final image.
  • the purpose of the embodiments of the present application is to provide an image processing method and device that can remove shadows in an image.
  • In a first aspect, the embodiments of the present application provide an image processing method, which includes:
  • acquiring a target image including a shadow; and
  • removing the shadow in the target image based on a target model to obtain a shadow-removed image, where the target model is a model trained based on a target image sample including a shadow and a shadow image sample corresponding to the target image sample, the target image sample is a sample generated based on a captured shadow-free image sample and preset simulated imaging conditions, and the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
  • In a second aspect, the embodiments of the present application provide an image processing device, which includes:
  • a target image acquisition module, used to acquire a target image including a shadow; and
  • a shadow removal module, used to remove the shadow in the target image based on a target model to obtain a shadow-removed image, where the target model is a model trained based on a target image sample including a shadow and a shadow image sample corresponding to the target image sample, the target image sample is a sample generated based on a captured shadow-free image sample and preset simulated imaging conditions, and the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
  • In a third aspect, an embodiment of the present application provides an electronic device that includes a processor, a memory, and a program or instruction stored in the memory and capable of running on the processor; when the program or instruction is executed by the processor, the steps of the method of the first aspect are implemented.
  • In a fourth aspect, the embodiments of the present application provide a computer-readable storage medium that stores a program or instruction; when the program or instruction is executed by a processor, the steps of the method of the first aspect are implemented.
  • In a fifth aspect, the embodiments of the present application provide a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement the steps of the method of the first aspect.
  • In the embodiments of the present application, a wide variety of real environments can be simulated using preset simulated imaging conditions, while the actually captured shadow-free image samples are real image data. By generating target image samples from the captured shadow-free image samples and the preset simulated imaging conditions, real data and simulated imaging conditions are combined, so that highly realistic target image samples, and the shadow image samples corresponding to them, can be obtained quickly for training the target model that removes image shadows; this improves the training accuracy and speed of the target model. Since the trained target model is trained based on target image samples including shadows and the corresponding shadow image samples, the trained target model can be used to remove the shadow in the target image, and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a target image including a shadow provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a process of constructing a target image sample provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of constructing target image samples and shadow image samples in the three-dimensional rendering engine provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a process of training a generative adversarial network provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of another process of training a generative adversarial network provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of still another process of training a generative adversarial network provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of still another process of training a generative adversarial network provided by an embodiment of the present application;
  • FIG. 9 is a schematic diagram of still another process of training a generative adversarial network provided by an embodiment of the present application;
  • FIG. 10 is a schematic diagram of an image after removing the shadow in FIG. 2 provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an image processing device provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • In some application scenarios, when a user photographs a document with the camera assembly of an electronic device, the captured document picture is often partially blocked by an obstruction such as the electronic device itself, so that in the finally obtained document image only part of the image is bright while the other part is dark, the dark part being the shadow produced by the obstruction blocking the light. In such scenes, the shadow in the image affects the imaging effect.
  • the embodiments of the present application provide an image processing method and device, which can remove shadows in an image.
  • the detailed description will be given below in conjunction with specific embodiments and drawings.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • the execution subject of the image processing method may be an image processing device.
  • the image processing method includes step 110 and step 120.
  • Step 110 Obtain a target image including shadows.
  • Step 120: Remove the shadow in the target image based on the target model to obtain a shadow-removed image.
  • the target model is a model trained based on target image samples including shadows and shadow image samples corresponding to the target image samples.
  • The target image sample is a sample generated based on the captured shadow-free image sample and preset simulated imaging conditions; the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
  • In the embodiments of the present application, a wide variety of real environments can be simulated using preset simulated imaging conditions, while the actually captured shadow-free image samples are real image data. By generating target image samples from the captured shadow-free image samples and the preset simulated imaging conditions, real data and simulated imaging conditions are combined, so that highly realistic target image samples, and the shadow image samples corresponding to them, can be obtained quickly for training the target model that removes image shadows, which improves the training accuracy and speed of the target model. Since the trained target model is trained based on target image samples including shadows and the corresponding shadow image samples, the trained target model can be used to remove the shadow in the target image, and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
  • In step 110, the target image may be an image that is actually captured by the user and includes a shadow.
  • FIG. 2 is a schematic diagram of a target image including a shadow provided by an embodiment of the present application. As shown in FIG. 2, the target image is an image of a document that includes a shadow.
  • The specific implementation of step 120 is described below.
  • In the embodiments of the present application, a large number of training samples are needed to train the target model; the training samples include data such as images with shadows and images without shadows, and such training samples are expensive to obtain in real life.
  • Therefore, in order to reduce the cost of obtaining training samples and improve the efficiency of obtaining them, and thereby improve the training efficiency of the target model, the target image samples can be generated based on actually captured shadow-free image samples and preset simulated imaging conditions.
  • a 3D rendering engine can be used to build the required simulation imaging scene, that is, to simulate the actual environment state, and then take snapshots or screenshots to obtain a large number of training samples.
  • the 3D rendering engine can be Maya modeling software, which is a 3D modeling and animation software.
  • the real imaging scene can be simulated through the 3D rendering engine.
  • FIG. 3 is a schematic diagram of the process of constructing a target image sample provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of constructing target image samples and shadow image samples in the three-dimensional rendering engine provided by the present application.
  • As shown in FIG. 3, a basic environment is first created in the 3D rendering engine, and then simulated imaging conditions such as a camera/imaging device, a preset obstruction, a light source, an atmospheric environment and an imaging plane are added to the basic environment.
  • FIG. 4 shows the preset simulated imaging conditions added in FIG. 3, including: a preset camera/imaging device, a preset simulated light source, a preset obstruction, an actual imaging object/imaging plane, a preset atmospheric environment and other conditions.
  • Each preset simulated imaging condition is adjusted according to the data conditions actually required and arranged in a regular manner.
  • To improve the authenticity of the training samples, the actually captured shadow-free image samples are added to the simulated imaging scene created in FIG. 4.
  • The actually captured shadow-free image sample (denoted as Isource) is added to the actual imaging object/imaging plane, that is, the actual imaging plane is filled with the actually captured shadow-free image sample.
  • Starting the renderer in the Maya modeling software to render the simulated imaging scene created in it outputs a target image sample (denoted as Ishadow) in which a shadow is superimposed on the actually captured shadow-free image sample. Then, by removing the actually captured shadow-free image sample from the target image sample, the shadow image sample (denoted as Imask) can be obtained.
  • Different training sample data can be obtained by adjusting the simulated imaging conditions and/or replacing the actually captured shadow-free image samples.
  • The simulated imaging conditions that can be adjusted may also include parameters such as the type of light source, the light intensity and the parameters of the atmospheric environment, which are not listed one by one here.
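  • As a hedged illustration of this sample-construction step, the sketch below pairs a rendered target image sample (Ishadow) with its captured shadow-free source (Isource) to derive a shadow image sample (Imask). The patent only states that Imask is determined from Ishadow and Isource; representing Imask as the signed per-pixel residual Ishadow - Isource is an assumption, as are the file paths.

```python
import numpy as np
from PIL import Image

def make_shadow_sample(ishadow_path: str, isource_path: str) -> np.ndarray:
    """Derive a shadow image sample (Imask) from a rendered target image sample
    (Ishadow) and the captured shadow-free sample (Isource).

    Assumption: Imask is the signed residual Ishadow - Isource, i.e. what the
    rendered occluder added (negative where the shadow darkens the content).
    Keeping it signed makes the later "remove the shadow image from the target
    image" step a plain subtraction.
    """
    ishadow = np.asarray(Image.open(ishadow_path), dtype=np.float32) / 255.0
    isource = np.asarray(Image.open(isource_path), dtype=np.float32) / 255.0
    return ishadow - isource

# Hypothetical usage: one (Ishadow, Imask) training pair per rendered screenshot.
# imask = make_shadow_sample("renders/ishadow_0001.png", "captures/isource_0001.png")
```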
  • In some embodiments, the target model may be a convolutional neural network, a recurrent neural network, or another type of model that removes shadows from an image. Inputting the target image into the target model directly yields the shadow-removed image.
  • In other embodiments, the trained target model can be used to determine the illumination parameters of the target image, and the target image can then be processed with these illumination parameters to remove the shadow in the target image.
  • In still other embodiments, the trained target model may be used to obtain the shadow image in the target image, and the shadow image can then be removed from the target image to obtain the shadow-removed image.
  • In some embodiments of the present application, in order to improve the accuracy of shadow removal, the trained target model may be a trained generative adversarial network.
  • To facilitate describing how the trained generative adversarial network is used to remove the shadow in the target image, the specific implementation of training the generative adversarial network is introduced first.
  • FIG. 5 is a schematic diagram of a process of training a generative adversarial network provided by an embodiment of the present application. As shown in FIG. 5, training the generative adversarial network includes steps 210 to 240.
  • Step 210: Obtain a plurality of target image samples and a shadow image sample corresponding to each target image sample.
  • Step 220: Input each target image sample into the first generator to be trained in the generative adversarial network to be trained, to obtain a predicted shadow image corresponding to the target image sample.
  • Step 230: For each target image sample, obtain a first discrimination result based on the predicted shadow image corresponding to the target image sample and the first discriminator to be trained in the generative adversarial network to be trained, and obtain a second discrimination result based on the shadow image sample corresponding to the target image sample and the first discriminator to be trained.
  • Step 240: Train the first generator to be trained and the first discriminator to be trained based on each first discrimination result and each second discrimination result, to obtain a trained generative adversarial network.
  • The target image sample is a sample generated based on an actually captured shadow-free image sample and preset simulated imaging conditions.
  • The shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
  • In the embodiments of the present application, a wide variety of real environments can be simulated using preset simulated imaging conditions, while the actually captured shadow-free image samples are real image data, so target image samples are generated from the captured shadow-free image samples and the preset simulated imaging conditions.
  • In this way, real data and simulated imaging conditions can be combined to quickly obtain highly realistic target image samples and the shadow image samples corresponding to them for training the generative adversarial network, which improves the training accuracy and speed of the generative adversarial network.
  • The specific implementation of each of steps 210 to 240 is introduced in detail below.
  • A generative adversarial network mainly includes two parts: a generator and a discriminator.
  • The generator is a network that, given a certain input, can autonomously output the data required of it, such as images, text or video; that is, it is used to generate data.
  • The discriminator is used to discriminate the data generated by the generator and determine whether the generated data is close to reality.
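  • The minimal PyTorch sketch below illustrates these two roles for this shadow-removal setting. The patent does not specify network architectures, loss functions or layer sizes; everything concrete here (layer counts, channel widths, the absence of a final activation on the generator) is an illustrative assumption.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps an input image to a generated image (here: a predicted shadow image).
    A tiny fully convolutional net; no final activation, so the output range can
    match however the shadow image is represented."""
    def __init__(self, in_ch: int = 3, out_ch: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class Discriminator(nn.Module):
    """Scores how realistic its input is: outputs the probability that the input
    is a real (non-generated) image."""
    def __init__(self, in_ch: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```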
  • The specific implementation of step 220 is described below.
  • During training, a batch of training samples can be used; the training samples include multiple target image samples and the shadow image sample corresponding to each target image sample.
  • FIG. 6 is a schematic diagram of another process of training a generative adversarial network provided by an embodiment of the present application.
  • As shown in FIG. 6, the target image sample is input into the first generator to be trained in the generative adversarial network to be trained, to obtain the predicted shadow image corresponding to the target image sample.
  • The predicted shadow image corresponding to the target image sample is the image generated by the first generator to be trained.
  • The specific implementation of step 230 is described below.
  • In some embodiments, step 230 includes: for each target image sample, inputting the predicted shadow image corresponding to the target image sample into the first discriminator to be trained in the generative adversarial network to be trained, to obtain the first discrimination result, and inputting the shadow image sample corresponding to the target image sample into the first discriminator to be trained, to obtain the second discrimination result.
  • In these embodiments, the first discriminator to be trained directly discriminates the image output by the first generator to be trained.
  • That is, the predicted shadow image is input into the first discriminator to be trained in the generative adversarial network to be trained to obtain the first discrimination result.
  • Likewise, the shadow image sample is input into the first discriminator to be trained in the generative adversarial network to be trained to obtain the second discrimination result.
  • The first discrimination result includes the probability that the predicted shadow image is a real image and the probability that the predicted shadow image is a fake image.
  • The second discrimination result includes the probability that the shadow image sample is a real image and the probability that the shadow image sample is a fake image.
  • In other embodiments, step 230 may include: for each target image sample, channel-fusing the target image sample with the predicted shadow image corresponding to that target image sample to obtain a first fused image, and channel-fusing the target image sample with the shadow image sample corresponding to that target image sample to obtain a second fused image; then inputting each first fused image and each second fused image into the first discriminator to be trained to obtain the first discrimination result and the second discrimination result, respectively.
  • In this case, the first discrimination result includes the probability that the first fused image is a real image and the probability that the first fused image is a fake image.
  • The second discrimination result includes the probability that the second fused image is a real image and the probability that the second fused image is a fake image.
  • FIG. 7 is a schematic diagram of still another process of training a generative adversarial network provided by an embodiment of the present application.
  • The difference between the training process shown in FIG. 7 and the training process shown in FIG. 6 is that in FIG. 7 the target image sample and the predicted shadow image corresponding to the target image sample are channel-fused to obtain the first fused image, and the first fused image is then input into the first discriminator to be trained to obtain the first discrimination result.
  • Likewise, the second fused image is obtained by channel-fusing the target image sample and the shadow image sample corresponding to the target image sample, and the second fused image is then input into the first discriminator to be trained to obtain the second discrimination result. That is, the inputs of the first discriminator to be trained are the first fused image and the second fused image, respectively.
  • Inputting the first fused image, obtained by channel-fusing the target image sample with the predicted shadow image corresponding to it, into the first discriminator to be trained allows the first discriminator to be trained to acquire more image feature information, thereby improving its accuracy.
  • Likewise, inputting the second fused image, obtained by channel-fusing the target image sample with the shadow image sample corresponding to it, into the first discriminator to be trained allows the first discriminator to be trained to obtain more image feature information, thereby improving its accuracy.
  • Channel fusion of two images may mean splicing the channels of two images with the same width and height one after another to generate an image with a greater depth.
  • Channel fusion of two images may also include adding or subtracting the corresponding channels of the two images.
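  • A minimal sketch of this fusion step is shown below, assuming images are held as PyTorch tensors in (N, C, H, W) layout; the function names are illustrative.

```python
import torch

def channel_fuse(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """Channel fusion by splicing: two tensors with the same height and width are
    concatenated along the channel axis, producing a deeper image (e.g. two
    3-channel images become one 6-channel discriminator input)."""
    assert img_a.shape[-2:] == img_b.shape[-2:], "images must share height and width"
    return torch.cat([img_a, img_b], dim=1)

def channel_add(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """The text also allows element-wise addition (or subtraction) of corresponding channels."""
    return img_a + img_b
```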
  • The specific implementation of step 240 is described below.
  • The generative adversarial network to be trained includes the first generator to be trained and the first discriminator to be trained. As shown in FIG. 6 and FIG. 7, the first generator to be trained and the first discriminator to be trained can be trained based on each first discrimination result and each second discrimination result, to obtain a trained generative adversarial network.
  • In some embodiments, the first generator to be trained and the first discriminator to be trained may be trained alternately. That is, when training the first generator to be trained, the parameters of the first discriminator to be trained are kept unchanged and the parameters of the first generator to be trained are adjusted; when training the first discriminator to be trained, the parameters of the first generator to be trained are kept unchanged and the parameters of the first discriminator to be trained are adjusted.
  • The following first introduces the specific implementation of training the first discriminator to be trained.
  • The label of the predicted shadow image corresponding to each target image sample is set to 0, which means the first discriminator to be trained expects the images output by the first generator to be trained to all be fake images.
  • The label of the shadow image sample corresponding to the target image sample is set to 1.
  • Based on the labels of the predicted shadow images and each first discrimination result, the loss function value LOSS 1 can be calculated.
  • Based on the labels of the shadow image samples and each second discrimination result, the loss function value LOSS 2 can be calculated.
  • The loss function value LOSS 1 and the loss function value LOSS 2 are added to obtain the loss function value LOSS D1. Then, based on the loss function value LOSS D1 and the back-propagation method, each parameter in the first discriminator to be trained is adjusted.
  • It should be noted that in the process of adjusting the parameters of the first discriminator to be trained, the parameters of the first generator to be trained do not change.
  • When training the first generator to be trained, the label of the predicted shadow image corresponding to each target image sample is set to 1, which means the first generator to be trained wants the images it outputs to all be regarded as real images.
  • Based on these labels and each first discrimination result, the loss function value LOSS G1 can be calculated. Then, based on the loss function value LOSS G1 and the back-propagation method, each parameter in the first generator to be trained is adjusted.
  • During this adjustment, the parameters of the first discriminator to be trained are kept unchanged and only the parameters of the first generator to be trained are adjusted.
  • In some embodiments, the relative magnitudes of LOSS D1 and LOSS G1 can be used to determine whether more emphasis should be placed on training the first generator to be trained or the first discriminator to be trained.
  • For example, the first generator to be trained and the first discriminator to be trained are trained alternately; if LOSS D1 is relatively large and LOSS G1 is relatively small during training, this indicates that the first generator to be trained already performs well, so the first discriminator to be trained can be trained 10 times and then the first generator to be trained is trained once.
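  • The sketch below illustrates one alternating update of this kind, reusing the Generator, Discriminator and channel_fuse sketches above. The binary cross-entropy loss, the optimizers and the exact update schedule are assumptions; the patent only fixes the labels (0 for generated images, 1 for samples), the sum LOSS D1 = LOSS 1 + LOSS 2, and the use of back-propagation.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step_d1(g1, d1, opt_d1, ishadow, imask):
    """One update of the first discriminator; the generator's parameters stay fixed."""
    with torch.no_grad():
        pred_mask = g1(ishadow)                               # predicted shadow image
    fake_score = d1(pred_mask)                                # first discrimination result
    real_score = d1(imask)                                    # second discrimination result
    loss1 = bce(fake_score, torch.zeros_like(fake_score))     # label 0: generated image
    loss2 = bce(real_score, torch.ones_like(real_score))      # label 1: shadow image sample
    loss_d1 = loss1 + loss2                                   # LOSS D1 = LOSS 1 + LOSS 2
    opt_d1.zero_grad(); loss_d1.backward(); opt_d1.step()
    return loss_d1.item()

def train_step_g1(g1, d1, opt_g1, ishadow):
    """One update of the first generator; the discriminator's parameters stay fixed."""
    pred_mask = g1(ishadow)
    score = d1(pred_mask)
    loss_g1 = bce(score, torch.ones_like(score))              # generator wants label 1
    opt_g1.zero_grad(); loss_g1.backward(); opt_g1.step()
    return loss_g1.item()

# The relative sizes of loss_d1 and loss_g1 can steer the schedule (e.g. several
# discriminator steps per generator step), and training can stop once both fall
# below a preset threshold or a preset number of iterations is reached.
# In the FIG. 7 variant, d1 would instead receive channel_fuse(ishadow, pred_mask)
# and channel_fuse(ishadow, imask) as its 6-channel inputs (built with in_ch=6).
```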
  • In some embodiments, the back-propagation algorithm may also be used to train the first generator to be trained and the first discriminator to be trained at the same time.
  • The process of using the back-propagation algorithm to train the first generator to be trained and the first discriminator to be trained at the same time is not repeated here.
  • In some embodiments, training of the first generator to be trained and the first discriminator to be trained may be stopped according to a first preset training stop condition.
  • As an example, the first preset training stop condition includes that both LOSS D1 and LOSS G1 are less than a first preset threshold.
  • As another example, the first preset training stop condition includes that the total number of training iterations of the first generator to be trained and the first discriminator to be trained reaches a preset threshold on the number of training iterations.
  • In addition, the accuracy of the first discriminator to be trained can be calculated from each first discrimination result and each second discrimination result; if the number of training iterations is N, each iteration corresponds to one accuracy value of the first discriminator to be trained.
  • The first preset training stop condition then also includes that the average of the N accuracy values is greater than a preset accuracy threshold, where N is a positive integer.
  • When the inputs of the first discriminator to be trained are the first fused image and the second fused image, LOSS D1 and LOSS G1 can be calculated based on the label and the first discrimination result corresponding to each first fused image and the label and the second discrimination result corresponding to each second fused image, so as to train the generative adversarial network to be trained.
  • The specific training process is similar to the training process described above for the scenario in which the predicted shadow image and the shadow image sample are input into the first discriminator to be trained, and is not repeated here.
  • If the trained generative adversarial network includes only the trained first generator and the trained first discriminator, only the shadow image can be obtained with the generative adversarial network; to perform shadow removal on the target image, the shadow image can be directly removed from the target image to obtain the shadow-removed image.
  • In other embodiments, a deep learning method can also be used to obtain the shadow-removed image; that is, the trained generative adversarial network further includes a second generator and a second discriminator, so that the second generator can be used to directly obtain the shadow-removed image.
  • In these embodiments, step 240 includes steps 2401 to 2403.
  • Step 2401: For each first fused image, input the first fused image into the second generator to be trained to obtain a predicted shadow-free image.
  • Step 2402: For each target image sample, obtain a third discrimination result based on the predicted shadow-free image corresponding to the target image sample and the second discriminator to be trained, and obtain a fourth discrimination result based on the shadow-free image sample corresponding to the target image sample and the second discriminator to be trained.
  • Step 2403: Train the first generator to be trained and the first discriminator to be trained based on each first discrimination result and each second discrimination result, and train the second generator to be trained and the second discriminator to be trained based on each third discrimination result and each fourth discrimination result, to obtain the trained generative adversarial network.
  • FIG. 8 is a schematic diagram of still another process of training a generative adversarial network provided by an embodiment of the present application.
  • The difference between the training process shown in FIG. 8 and the training process shown in FIG. 7 is that, in some embodiments, for each first fused image, the first fused image is input into the second generator to be trained to obtain a predicted shadow-free image.
  • For each predicted shadow-free image, the predicted shadow-free image is input into the second discriminator to be trained to obtain a third discrimination result. For each shadow-free image sample, the shadow-free image sample is input into the second discriminator to be trained to obtain a fourth discrimination result. Then, based on the third discrimination results and the fourth discrimination results, the second generator to be trained and the second discriminator to be trained can be trained.
  • the process of training the second generator to be trained and the second discriminator to be trained is similar to the process of training the first generator to be trained and the first discriminator to be trained.
  • the second generator to be trained and the second discriminator to be trained may be trained alternately.
  • the following first introduces the specific implementation of training the second discriminator to be trained.
  • The label of each predicted shadow-free image is set to 0, which means the second discriminator to be trained expects the images output by the second generator to be trained to all be fake images.
  • Based on the labels of the predicted shadow-free images and each third discrimination result, the loss function value LOSS 3 can be calculated.
  • Based on the labels of the shadow-free image samples and each fourth discrimination result, the loss function value LOSS 4 can be calculated.
  • The loss function value LOSS 3 and the loss function value LOSS 4 are added to obtain the loss function value LOSS D2. Then, based on the loss function value LOSS D2 and the back-propagation method, each parameter in the second discriminator to be trained is adjusted.
  • It should be noted that in the process of adjusting the parameters of the second discriminator to be trained, the parameters of the second generator to be trained do not change.
  • When training the second generator to be trained, the label of each predicted shadow-free image is set to 1, which means the second generator to be trained wants the images it outputs to all be regarded as real images.
  • Based on these labels and each third discrimination result, the loss function value LOSS G2 can be calculated. Then, based on the loss function value LOSS G2 and the back-propagation method, each parameter in the second generator to be trained is adjusted. It should be noted that in the process of adjusting the parameters of the second generator to be trained, the parameters of the second discriminator to be trained do not change.
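  • A hedged sketch of one second-stage update in this cascaded scheme is shown below, reusing the Generator, Discriminator, channel_fuse and bce sketches above; here the second generator takes the 6-channel first fused image as input (e.g. Generator(in_ch=6)). Architectures, losses and optimizers remain assumptions.

```python
def train_step_d2(g1, g2, d2, opt_d2, ishadow, isource):
    """One update of the second discriminator; generator parameters stay fixed."""
    with torch.no_grad():
        pred_mask = g1(ishadow)                          # predicted shadow image
        fused_1 = channel_fuse(ishadow, pred_mask)       # first fused image (6 channels)
        pred_clean = g2(fused_1)                         # predicted shadow-free image
    fake_score = d2(pred_clean)                          # third discrimination result
    real_score = d2(isource)                             # fourth discrimination result
    loss3 = bce(fake_score, torch.zeros_like(fake_score))
    loss4 = bce(real_score, torch.ones_like(real_score))
    loss_d2 = loss3 + loss4                              # LOSS D2 = LOSS 3 + LOSS 4
    opt_d2.zero_grad(); loss_d2.backward(); opt_d2.step()
    return loss_d2.item()

def train_step_g2(g1, g2, d2, opt_g2, ishadow):
    """One update of the second generator; D2 (and, here, G1) stay fixed."""
    with torch.no_grad():
        pred_mask = g1(ishadow)
    pred_clean = g2(channel_fuse(ishadow, pred_mask))
    score = d2(pred_clean)
    loss_g2 = bce(score, torch.ones_like(score))
    opt_g2.zero_grad(); loss_g2.backward(); opt_g2.step()
    return loss_g2.item()

# In the FIG. 9 variant described further below, d2 would instead receive the third
# and fourth fused images: channel_fuse(pred_clean, fused_1) and
# channel_fuse(isource, fused_2), where fused_2 = channel_fuse(ishadow, imask).
```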
  • In some embodiments, the relative magnitudes of LOSS D2 and LOSS G2 can be used to determine whether more emphasis should be placed on training the second generator to be trained or the second discriminator to be trained.
  • In some embodiments, the back-propagation algorithm may also be used to train the second generator to be trained and the second discriminator to be trained at the same time.
  • The process of using the back-propagation algorithm to train the second generator to be trained and the second discriminator to be trained at the same time is not repeated here.
  • In some embodiments, training of the second generator to be trained and the second discriminator to be trained may be stopped according to a second preset training stop condition.
  • As an example, the second preset training stop condition includes that both LOSS D2 and LOSS G2 are less than a second preset threshold.
  • As another example, the second preset training stop condition includes that the number of training iterations of the second generator to be trained and the second discriminator to be trained reaches a preset threshold on the number of training iterations.
  • When training the second generator to be trained and the second discriminator to be trained, different batches of training samples may be used in each training pass.
  • In addition, the accuracy of the second discriminator to be trained can be calculated from each third discrimination result and each fourth discrimination result; if the number of training iterations is M, each iteration corresponds to one accuracy value of the second discriminator to be trained.
  • The second preset training stop condition then further includes that the average of the M accuracy values is greater than the preset accuracy threshold, where M is a positive integer.
  • In some embodiments, step 2403 includes: for each target image sample, channel-fusing the predicted shadow-free image corresponding to the target image sample with the first fused image corresponding to that target image sample to obtain a third fused image, and channel-fusing the shadow-free image sample corresponding to the target image sample with the second fused image corresponding to that target image sample to obtain a fourth fused image; then inputting each third fused image and each fourth fused image into the second discriminator to be trained to obtain the third discrimination result and the fourth discrimination result, respectively.
  • FIG. 9 is a schematic diagram of still another process of training a generation confrontation network provided by an embodiment of the present application.
  • The difference between the training process shown in FIG. 9 and the training process shown in FIG. 8 is that the third fused image is obtained by channel-fusing the first fused image with the predicted shadow-free image, and the third fused image is then input into the second discriminator to be trained to obtain the third discrimination result.
  • Likewise, the fourth fused image is obtained by channel-fusing the second fused image with the shadow-free image sample, and the fourth fused image is then input into the second discriminator to be trained to obtain the fourth discrimination result.
  • That is, the inputs of the second discriminator to be trained are the third fused image and the fourth fused image, respectively.
  • Then, the second generator to be trained and the second discriminator to be trained are trained based on the third discrimination results and the fourth discrimination results.
  • When the inputs of the second discriminator to be trained are the third fused image and the fourth fused image, LOSS D2 and LOSS G2 can be calculated based on the label and the third discrimination result corresponding to each third fused image and the label and the fourth discrimination result corresponding to each fourth fused image, so as to train the generative adversarial network to be trained.
  • The specific training process is similar to the training process described above for the scenario in which the predicted shadow-free image and the shadow-free image sample are input into the second discriminator to be trained, and is not repeated here.
  • In the process of training the generative adversarial network, both the generator and the discriminator need to be trained.
  • However, when the trained generative adversarial network is used to remove the shadow in the target image, only the generator in the trained generative adversarial network needs to be used.
  • step 120 includes step 1201 and step 1202.
  • Step 1201: Input the target image into the first generator of the generative adversarial network to obtain the shadow image corresponding to the target image.
  • Step 1202: Obtain the shadow-removed image based on the shadow image and the target image.
  • The first generator is obtained after the training of the first generator to be trained is completed.
  • Correspondingly, the first discriminator is obtained after the training of the first discriminator to be trained is completed.
  • The function of the first generator is to generate the corresponding shadow image from the input target image that includes a shadow.
  • step 1202 includes removing the shadow image from the target image to obtain a shadow-removed image.
  • If the generative adversarial network includes only the first generator and the first discriminator, the time needed to train the generative adversarial network can be saved, so the efficiency of producing the generative adversarial network can be improved, and the efficiency of removing the shadow in the target image can be improved accordingly.
  • In other embodiments, the trained generative adversarial network further includes a second generator and a second discriminator.
  • The second generator is obtained after the training of the second generator to be trained is completed.
  • Correspondingly, the second discriminator is obtained after the training of the second discriminator to be trained is completed.
  • The function of the second generator is to generate a shadow-free image from the target fused image obtained by fusing the input target image and the shadow image.
  • In these embodiments, step 1202 includes: channel-fusing the shadow image and the target image to obtain a target fused image; and inputting the target fused image into the second generator to obtain the shadow-removed image.
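  • A hedged inference sketch covering both variants (direct removal of the shadow image, and the second-generator path) is shown below, reusing the earlier sketches (torch and channel_fuse). Treating "removing" the shadow image as a subtraction follows the signed-residual convention assumed in the sample-construction sketch above and is not mandated by the text.

```python
def remove_shadow(g1, target, g2=None):
    """Shadow removal at inference time; only the trained generators are used.

    Path 1 (g2 is None): the predicted shadow image is removed from the target
    image directly (step 1201, then a subtraction under the assumed convention
    that the shadow image is the signed residual added by the shadow).
    Path 2: the target image and the shadow image are channel-fused and the
    target fused image is passed to the second generator (step 1202)."""
    with torch.no_grad():
        shadow = g1(target)                                  # step 1201: shadow image
        if g2 is None:
            return torch.clamp(target - shadow, 0.0, 1.0)    # remove shadow image from target
        fused = channel_fuse(target, shadow)                 # target fused image
        return g2(fused)                                     # shadow-removed image
```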
  • FIG. 10 is a schematic diagram of an image after removing the shadow in FIG. 2 provided by an embodiment of the present application.
  • FIG. 10 shows the image of the document after the shadow has been removed.
  • In this way, a deep learning method can be used to obtain a shadow-free image with higher accuracy.
  • In the embodiments of the present application, the training samples are obtained using the 3D rendering engine, and the training samples are then used to train the generative adversarial network, so as to remove shadows in images and address the pain point of having to constantly adjust one's posture when taking certain photographs.
  • The 3D rendering engine can simulate a wide variety of real environments, so the generated data covers a broad range; at the same time, the various imaging parameters in the 3D rendering engine, that is, the simulated imaging conditions, can be adjusted freely, so the generated training sample data is diverse.
  • By adding actually captured shadow-free image samples to the simulated scene, the authenticity and reliability of the generated training sample data can be improved, and rendering time and modeling difficulty can be reduced.
  • The training efficiency of the network can be improved by combining the 3D rendering engine with the training of the generative adversarial network, so as to quickly remove the shadows in the image and make the overall image clearer and cleaner.
  • It should be noted that, for the image processing method provided in the embodiments of the present application, the execution subject may be an image processing device, or a control module in the image processing device for executing the image processing method.
  • In the embodiments of the present application, an image processing device executing the image processing method is taken as an example to describe the image processing device provided in the embodiments of the present application.
  • FIG. 11 is a schematic structural diagram of an image processing device provided by an embodiment of the present application. As shown in FIG. 11, the image processing device 300 includes:
  • the target image acquisition module 310 is used to acquire a target image including shadows.
  • the shadow removal module 320 is configured to remove the shadow in the target image based on the target model to obtain a shadow-removed image.
  • the target model is a model trained based on target image samples including shadows and shadow image samples corresponding to the target image samples.
  • The target image sample is a sample generated based on the captured shadow-free image sample and preset simulated imaging conditions; the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
  • In the embodiments of the present application, a wide variety of real environments can be simulated using preset simulated imaging conditions, while the actually captured shadow-free image samples are real image data. By generating target image samples from the captured shadow-free image samples and the preset simulated imaging conditions, real data and simulated imaging conditions are combined, so that highly realistic target image samples, and the shadow image samples corresponding to them, can be obtained quickly for training the target model that removes image shadows, which improves the training accuracy and speed of the target model. Since the trained target model is trained based on target image samples including shadows and the corresponding shadow image samples, the trained target model can be used to remove the shadow in the target image, and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
  • the target model includes a generative adversarial network.
  • the shadow removal module 320 includes:
  • a shadow image determination unit, used to input the target image into the first generator of the generative adversarial network to obtain the shadow image corresponding to the target image; and
  • a shadow removal unit, used to obtain the shadow-removed image based on the shadow image and the target image.
  • In some embodiments, the generative adversarial network further includes a second generator, and the shadow removal unit is used to:
  • channel-fuse the shadow image and the target image to obtain a target fused image; and input the target fused image into the second generator to obtain the shadow-removed image.
  • In some embodiments, in order to improve the efficiency of removing shadows in the image, the shadow removal unit is used to: remove the shadow image from the target image to obtain the shadow-removed image.
  • the image processing device in the embodiment of the present application may be a device, or a component, integrated circuit, or chip in the device.
  • the device can be a mobile electronic device or a non-mobile electronic device.
  • The mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), etc.
  • The non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine or a self-service machine, etc.; the embodiments of the present application are not specifically limited in this respect.
  • The image processing device in the embodiments of the present application may be a device with an operating system.
  • The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • The image processing device provided in the embodiments of the present application can implement each process of the method embodiment in FIG. 1; to avoid repetition, details are not repeated here.
  • an embodiment of the present application further provides an electronic device, including a processor, a memory, and a program or instruction stored in the memory and capable of running on the processor.
  • When the program or instruction is executed by the processor, each process of the above image processing method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 12 is a schematic diagram of the hardware structure of an electronic device that implements an embodiment of the present application.
  • The electronic device 400 includes but is not limited to: a radio frequency unit 401, a network module 402, an audio output unit 403, an input unit 404, a sensor 405, a display unit 406, a user input unit 407, an interface unit 408, a memory 409, a processor 410, and other components.
  • The electronic device 400 may also include a power source (such as a battery) for supplying power to the various components.
  • The power source may be logically connected to the processor 410 through a power management system, so that the power management system can implement functions such as managing charging, discharging, and power consumption.
  • The structure of the electronic device shown in FIG. 12 does not constitute a limitation on the electronic device.
  • The electronic device may include more or fewer components than shown in the figure, combine certain components, or use a different arrangement of components; details are not repeated here.
  • The processor 410 is used to: obtain a target image including a shadow; and remove the shadow in the target image based on the target model to obtain a shadow-removed image, where the target model is a model trained based on a target image sample including a shadow and a shadow image sample corresponding to the target image sample.
  • The target image sample is a sample generated based on a captured shadow-free image sample and preset simulated imaging conditions, and the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
  • In the embodiments of the present application, a wide variety of real environments can be simulated using preset simulated imaging conditions, while the actually captured shadow-free image samples are real image data. By generating target image samples from the captured shadow-free image samples and the preset simulated imaging conditions, real data and simulated imaging conditions are combined, so that highly realistic target image samples, and the shadow image samples corresponding to them, can be obtained quickly for training the target model that removes image shadows, which improves the training accuracy and speed of the target model. Since the trained target model is trained based on target image samples including shadows and the corresponding shadow image samples, the trained target model can be used to remove the shadow in the target image, and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
  • In some embodiments, the processor 410 is further configured to: input the target image into the first generator of the generative adversarial network to obtain the shadow image corresponding to the target image; and obtain the shadow-removed image based on the shadow image and the target image.
  • In this way, the shadow in the target image can be removed.
  • In some embodiments, the generative adversarial network further includes a second generator.
  • The processor 410 is further configured to: channel-fuse the shadow image and the target image to obtain a target fused image; and input the target fused image into the second generator to obtain the shadow-removed image.
  • If the generative adversarial network includes only the first generator and the first discriminator, the time needed to train the generative adversarial network can be saved, so the efficiency of producing the generative adversarial network can be improved, and the efficiency of removing the shadow in the target image can be improved.
  • With the second generator, a deep learning method can be used to obtain a shadow-free image with higher accuracy.
  • In some embodiments, the processor 410 is further configured to remove the shadow image from the target image to obtain the shadow-removed image.
  • If the generative adversarial network includes only the first generator and the first discriminator, this saves the time needed to train the generative adversarial network, improves the efficiency of producing it, and improves the efficiency of removing the shadow in the target image.
  • the embodiments of the present application also provide a computer-readable storage medium with a program or instruction stored on the computer-readable storage medium.
  • When the program or instruction is executed by a processor, each process of the above image processing method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • the processor is the processor in the electronic device in the foregoing embodiment.
  • Examples of computer-readable storage media include non-transitory computer-readable storage media, such as computer read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disks, or optical disks.
  • An embodiment of the present application also provides an electronic device configured to perform each process of the foregoing image processing method embodiment, which can achieve the same technical effect; to avoid repetition, details are not repeated here.
  • The embodiments of the present application also provide a chip.
  • The chip includes a processor and a communication interface.
  • The communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement each process of the foregoing image processing method embodiment, which can achieve the same technical effect; to avoid repetition, details are not repeated here.
  • It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip chip.
  • The technical solution of the present application, in essence or the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Signal Processing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)
  • Geometry (AREA)

Abstract

An image processing method and device, belonging to the field of communication technology. The method includes: acquiring a target image including a shadow; and removing the shadow in the target image based on a target model to obtain a shadow-removed image, where the target model is a model trained based on a target image sample including a shadow and a shadow image sample corresponding to the target image sample, the target image sample is a sample generated based on a captured shadow-free image sample and preset simulated imaging conditions, and the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.

Description

Image processing method and device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202010435861.8, filed in China on May 21, 2020, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application belongs to the field of communication technology, and specifically relates to an image processing method and device.
BACKGROUND
When a user takes a photo with a camera device, limitations such as lighting and shooting angle often cause the captured image to contain shadows of obstructions such as people or scenery. For example, when a user photographs a document, occlusion by the camera often leaves a shadow cast by the obstruction in the finally captured image.
In the process of implementing this application, the applicant found that the related art has at least the following problem: if the captured image contains a shadow produced by an obstruction, the imaging effect is not clear enough, which affects the user's shooting experience. Therefore, in order to improve the user's shooting experience, an image processing method that can remove shadows from images is urgently needed.
SUMMARY
The purpose of the embodiments of this application is to provide an image processing method and device that can remove shadows in an image.
In order to solve the above technical problem, this application is implemented as follows:
In a first aspect, an embodiment of this application provides an image processing method, the method including:
acquiring a target image including a shadow; and
removing the shadow in the target image based on a target model to obtain a shadow-removed image, where the target model is a model trained based on a target image sample including a shadow and a shadow image sample corresponding to the target image sample, the target image sample is a sample generated based on a captured shadow-free image sample and preset simulated imaging conditions, and the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
In a second aspect, an embodiment of this application provides an image processing device, the device including:
a target image acquisition module, configured to acquire a target image including a shadow; and
a shadow removal module, configured to remove the shadow in the target image based on a target model to obtain a shadow-removed image, where the target model is a model trained based on a target image sample including a shadow and a shadow image sample corresponding to the target image sample, the target image sample is a sample generated based on a captured shadow-free image sample and preset simulated imaging conditions, and the shadow image sample is a sample determined based on the target image sample and the shadow-free image sample.
In a third aspect, an embodiment of this application provides an electronic device including a processor, a memory, and a program or instruction stored in the memory and executable on the processor, where the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a program or instruction, where the program or instruction, when executed by a processor, implements the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of this application provides a chip, the chip including a processor and a communication interface, where the communication interface is coupled to the processor and the processor is configured to run a program or instruction to implement the steps of the method according to the first aspect.
In the embodiments of this application, a wide variety of real environments can be simulated using preset simulated imaging conditions, while the actually captured shadow-free image samples are real image data. By generating target image samples from the actually captured shadow-free image samples and the preset simulated imaging conditions, real data and simulated imaging conditions are combined, so that highly realistic target image samples, and the shadow image samples corresponding to them, can be obtained quickly for training the target model that removes image shadows, which improves the training accuracy and speed of the target model. Since the trained target model is trained based on target image samples including shadows and the corresponding shadow image samples, the trained target model can be used to remove the shadow in the target image, and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
Brief description of the drawings
FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of this application;
FIG. 2 is a schematic diagram of a target image that includes a shadow according to an embodiment of this application;
FIG. 3 is a schematic flowchart of constructing a target image sample according to an embodiment of this application;
FIG. 4 is a schematic diagram of constructing target image samples and shadow image samples in a 3D rendering engine according to an embodiment of this application;
FIG. 5 is a schematic flowchart of training a generative adversarial network according to an embodiment of this application;
FIG. 6 is another schematic flowchart of training a generative adversarial network according to an embodiment of this application;
FIG. 7 is still another schematic flowchart of training a generative adversarial network according to an embodiment of this application;
FIG. 8 is still another schematic flowchart of training a generative adversarial network according to an embodiment of this application;
FIG. 9 is still another schematic flowchart of training a generative adversarial network according to an embodiment of this application;
FIG. 10 is a schematic diagram of the image obtained after removing the shadow in FIG. 2 according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of an image processing apparatus according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings in the embodiments of this application. Evidently, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and the like in the specification and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of this application can be implemented in orders other than those illustrated or described herein. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The image processing method and apparatus provided in the embodiments of this application are described in detail below with reference to the accompanying drawings, through specific embodiments and their application scenarios.
In some application scenarios, when a user photographs a document with the camera component of an electronic device, the document is often partially blocked by an occluding object such as the electronic device itself. In the final document image, part of the image is bright and part is dark, the dark part being a shadow produced by the occluding object blocking the light. In some scenarios, the shadow in the image degrades the imaging result.
On this basis, the embodiments of this application provide an image processing method and apparatus capable of removing shadows from images. Detailed descriptions are given below with reference to specific embodiments and the accompanying drawings.
FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of this application. The method may be performed by an image processing apparatus. As shown in FIG. 1, the image processing method includes step 110 and step 120.
Step 110: Obtain a target image that includes a shadow.
Step 120: Remove the shadow from the target image based on a target model to obtain a shadow-free image.
The target model is a model trained on target image samples that include shadows and shadow image samples corresponding to the target image samples.
The target image samples are samples generated based on captured shadow-free image samples and preset simulated imaging conditions; the shadow image samples are samples determined based on the target image samples and the shadow-free image samples.
In the embodiments of this application, preset simulated imaging conditions can emulate a wide variety of real-world environments, while actually captured shadow-free image samples are real image data. By generating target image samples from actually captured shadow-free image samples and preset simulated imaging conditions, real data are combined with simulated imaging conditions, so that highly realistic target image samples, and their corresponding shadow image samples, can be obtained quickly for training the target model that removes image shadows, which improves both the training accuracy and the training speed of the target model. Because the trained target model is obtained by training on target image samples that include shadows and their corresponding shadow image samples, the trained target model can remove the shadow from the target image; and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
In step 110, the target image may be an image that includes a shadow and is actually captured by the user. FIG. 2 is a schematic diagram of a target image that includes a shadow according to an embodiment of this application. As shown in FIG. 2, the target image is an image of a document that includes a shadow.
A specific implementation of step 120 is described below.
In the embodiments of this application, a large number of training samples are needed to train the target model. The training samples include data such as images that contain shadows and images that do not, and collecting such samples in real life is costly.
Therefore, to reduce the cost of acquiring training samples and improve the efficiency of sample acquisition, and thus the training efficiency of the target model, the target image samples may be generated based on actually captured shadow-free image samples and preset simulated imaging conditions.
As an example, a 3D rendering engine may be used to build the required simulated imaging scene, that is, to simulate the actual environment state, and a large number of training samples may then be obtained by taking snapshots or screenshots.
As an example, the 3D rendering engine may be the Maya modeling software, a 3D modeling and animation package, which can simulate a real imaging scene. FIG. 3 is a schematic flowchart of constructing a target image sample according to an embodiment of this application. FIG. 4 is a schematic diagram of constructing target image samples and shadow image samples in the 3D rendering engine provided in this application. As shown in FIG. 3, a base environment is first created in the 3D rendering engine, and then simulated imaging conditions such as a camera/imaging device, a preset occluding object, a light source, an atmospheric environment, and an imaging plane are added to the base environment. It should be noted that the order in which the different simulated imaging conditions are added is not limited; FIG. 3 is only an example. FIG. 4 shows the preset simulated imaging conditions added in FIG. 3, including a preset camera/imaging device, a preset simulated light source, a preset occluding object, an actual imaged object/imaging plane, and a preset atmospheric environment.
Each preset simulated imaging condition is adjusted according to the actually required data conditions and arranged according to rules. To improve the realism of the training samples, as shown in FIG. 3, the actually captured shadow-free image samples are added to the simulated imaging scene created in FIG. 4. An actually captured shadow-free image sample (denoted Isource) is placed on the actual imaged object/imaging plane, that is, the imaging plane is filled with the actually captured shadow-free image sample. When the renderer in the Maya modeling software is started to render the created simulated imaging scene, a target image sample (denoted Ishadow) is output, in which a shadow is superimposed on the actually captured shadow-free image sample. Then, by removing the actually captured shadow-free image sample from the target image sample, the shadow image sample (denoted Imask) can be obtained.
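The description does not fix the exact arithmetic by which the shadow-free sample is "removed" from the rendered sample, so the following is only a minimal sketch under one plausible convention, in which Imask is kept as a per-pixel intensity ratio (a soft shadow map); the clipping range and the synthetic example data are illustrative assumptions rather than details taken from this application.

```python
import numpy as np

def derive_shadow_mask(i_shadow: np.ndarray, i_source: np.ndarray) -> np.ndarray:
    """Derive a shadow image sample (Imask) from a rendered sample (Ishadow)
    and the shadow-free source sample (Isource).

    Both inputs are float arrays in [0, 1] with identical shapes. The ratio
    convention is an assumption: pixels untouched by the occluder give values
    near 1, shadowed pixels give values below 1.
    """
    eps = 1e-6                      # avoid division by zero in dark regions
    mask = i_shadow / (i_source + eps)
    return np.clip(mask, 0.0, 1.0)  # keep the soft shadow map in [0, 1]

# Example usage with synthetic data standing in for a rendered sample.
i_source = np.random.rand(256, 256, 3).astype(np.float32)
i_shadow = i_source * 0.5           # pretend the renderer darkened everything
i_mask = derive_shadow_mask(i_shadow, i_source)
```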
To generate a large amount of diverse training sample data, camera-related factors such as the focal length and the field of view may be modified randomly; the imaging effect of the occluding object may be changed randomly, for example its size and placement angle; or imaging-plane-related factors may be modified randomly, for example the angle of the imaging plane or its distances to the light source, the camera, the occluding object, and other objects.
Different training sample data can be obtained by adjusting the simulated imaging conditions and/or replacing the actually captured shadow-free image samples. The adjustable simulated imaging conditions may further include the type of light source, the illumination intensity, parameters of the atmospheric environment, and so on, which are not enumerated one by one here.
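As an illustration of how such randomized imaging conditions might be drawn before each render, the sketch below samples one configuration per training sample; every parameter name and value range shown is a hypothetical example, not a setting taken from this application.

```python
import random

def sample_scene_config() -> dict:
    """Randomly sample one set of simulated imaging conditions for a render."""
    return {
        "camera_focal_length_mm": random.uniform(24.0, 85.0),
        "camera_fov_deg": random.uniform(40.0, 75.0),
        "occluder_scale": random.uniform(0.5, 2.0),
        "occluder_angle_deg": random.uniform(-45.0, 45.0),
        "plane_tilt_deg": random.uniform(-15.0, 15.0),
        "plane_to_light_distance_m": random.uniform(0.5, 3.0),
        "light_intensity": random.uniform(0.3, 1.5),
    }

# One configuration per rendered training sample.
configs = [sample_scene_config() for _ in range(1000)]
```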
In some embodiments, the target model may be any of various models capable of removing shadows from images, such as a convolutional neural network or a recurrent neural network. The target image is input into the target model, and the shadow-free image is obtained directly.
In other embodiments, the trained target model may be used to determine illumination parameters of the target image, and the target image is then processed with these illumination parameters to remove its shadow.
In still other embodiments, the trained target model may be used to obtain the shadow image in the target image, and the shadow image is then removed from the target image to obtain the shadow-free image.
In some embodiments of this application, to improve shadow removal accuracy, the trained target model may be a trained generative adversarial network (GAN). To facilitate the description of how the trained GAN removes the shadow from the target image, a specific implementation of training the GAN is described first.
FIG. 5 is a schematic flowchart of training a generative adversarial network according to an embodiment of this application. As shown in FIG. 5, training the GAN includes steps 210 to 240.
Step 210: Obtain multiple target image samples and the shadow image sample corresponding to each target image sample.
Step 220: Input each target image sample into a first to-be-trained generator of the to-be-trained GAN to obtain a predicted shadow image corresponding to the target image sample.
Step 230: For each target image sample, obtain a first discrimination result based on the predicted shadow image corresponding to the target image sample and a first to-be-trained discriminator of the to-be-trained GAN, and obtain a second discrimination result based on the shadow image sample corresponding to the target image sample and the first to-be-trained discriminator.
Step 240: Train the first to-be-trained generator and the first to-be-trained discriminator based on each first discrimination result and each second discrimination result, to obtain the trained GAN.
The target image samples are samples generated based on actually captured shadow-free image samples and preset simulated imaging conditions. The shadow image samples are samples determined based on the target image samples and the shadow-free image samples.
In the embodiments of this application, preset simulated imaging conditions can emulate a wide variety of real-world environments, while actually captured shadow-free image samples are real image data. By generating target image samples from actually captured shadow-free image samples and preset simulated imaging conditions, real data are combined with simulated imaging conditions, so that highly realistic target image samples, and their corresponding shadow image samples, can be obtained quickly for training the GAN, which improves both the training accuracy and the training speed of the GAN.
Specific implementations of each of steps 210 to 240 are described in detail below.
In the embodiments of this application, a generative adversarial network mainly consists of two parts: a generator and a discriminator. Given certain inputs, the generator autonomously outputs the required data, such as images, text, or video; in other words, it is used to generate data. The discriminator is used to discriminate the data generated by the generator, judging whether the generated data are close to real data.
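For illustration only, a minimal PyTorch sketch of such a generator/discriminator pair is given below; the layer counts, channel widths, and activations are assumptions chosen for brevity, not the networks used in this application.

```python
import torch
import torch.nn as nn

class ShadowGenerator(nn.Module):
    """Maps a 3-channel shadowed image to a predicted 3-channel shadow image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

class ShadowDiscriminator(nn.Module):
    """Outputs the probability that its input is a real shadow image."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)
```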
A specific implementation of step 220 is described below.
In one training pass of the to-be-trained GAN, a batch of training samples may be used, including multiple target image samples and the shadow image sample corresponding to each target image sample.
As an example, FIG. 6 is another schematic flowchart of training the GAN according to an embodiment of this application. For each target image sample, the target image sample is input into the first to-be-trained generator of the to-be-trained GAN to obtain the corresponding predicted shadow image. The predicted shadow image corresponding to the target image sample is the image generated by the first to-be-trained generator.
A specific implementation of step 230 is described below.
In some embodiments, to increase training speed, step 230 includes: for each target image sample, inputting the predicted shadow image corresponding to the target image sample into the first to-be-trained discriminator of the to-be-trained GAN to obtain a first discrimination result, and inputting the shadow image sample corresponding to the target image sample into the first to-be-trained discriminator to obtain a second discrimination result.
In other words, the first to-be-trained discriminator directly discriminates the images output by the first to-be-trained generator. The predicted shadow image is input into the first to-be-trained discriminator of the to-be-trained GAN to obtain the first discrimination result, and the shadow image sample is input into the first to-be-trained discriminator of the to-be-trained GAN to obtain the second discrimination result.
As an example, the first discrimination result includes the probability that the predicted shadow image is a real image and the probability that it is a fake image. The second discrimination result includes the probability that the shadow image sample is a real image and the probability that it is a fake image.
In other embodiments, to increase the discrimination accuracy of the first to-be-trained discriminator, the discriminator needs to obtain more feature information. Therefore, step 230 may include: for each target image sample, performing channel fusion on the target image sample and its corresponding predicted shadow image to obtain a first fused image, and performing channel fusion on the target image sample and its corresponding shadow image sample to obtain a second fused image; and for each first fused image and each second fused image, inputting the first fused image and the second fused image separately into the first to-be-trained discriminator to obtain the first discrimination result and the second discrimination result.
The first discrimination result includes the probability that the first fused image is a real image and the probability that it is a fake image. The second discrimination result includes the probability that the second fused image is a real image and the probability that it is a fake image.
FIG. 7 is still another schematic flowchart of training the GAN according to an embodiment of this application. The flow in FIG. 7 differs from that in FIG. 6 in that, in FIG. 7, the first fused image is obtained by channel fusion of the target image sample and its corresponding predicted shadow image and is then input into the first to-be-trained discriminator to obtain the first discrimination result, and the second fused image is obtained by channel fusion of the target image sample and its corresponding shadow image sample and is then input into the first to-be-trained discriminator to obtain the second discrimination result. In other words, the inputs to the first to-be-trained discriminator are the first fused image and the second fused image, respectively.
In the embodiments of this application, inputting the first fused image, obtained by channel fusion of the target image sample and its corresponding predicted shadow image, into the first to-be-trained discriminator allows the discriminator to obtain more image feature information, which improves its discrimination accuracy. Similarly, inputting the second fused image, obtained by channel fusion of the target image sample and its corresponding shadow image sample, into the first to-be-trained discriminator allows the discriminator to obtain more image feature information, which improves its discrimination accuracy.
As an example, channel fusion of two images may consist of concatenating, in sequence, the channels of two images with the same height and width, generating an image with greater depth. In another example, channel fusion of two images may also include operations such as adding or subtracting the corresponding channels of the two images.
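A minimal sketch of the concatenation variant of channel fusion, assuming tensors in PyTorch's (N, C, H, W) layout, might look as follows; the image sizes in the usage example are arbitrary.

```python
import torch

def channel_fuse(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """Concatenate two same-sized images along the channel dimension.

    img_a, img_b: tensors of shape (N, C, H, W) with identical N, H, W.
    Returns a tensor of shape (N, C_a + C_b, H, W), i.e. an image of greater depth.
    """
    return torch.cat([img_a, img_b], dim=1)

# Example: fusing a 3-channel target image sample with a 3-channel predicted
# shadow image yields a 6-channel first fused image.
target = torch.rand(1, 3, 256, 256)
pred_shadow = torch.rand(1, 3, 256, 256)
first_fused = channel_fuse(target, pred_shadow)   # shape (1, 6, 256, 256)
```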
A specific implementation of step 240 is described below.
In some embodiments, the to-be-trained GAN includes the first to-be-trained generator and the first to-be-trained discriminator. As shown in FIG. 6 and FIG. 7, the first to-be-trained generator and the first to-be-trained discriminator may be trained based on each first discrimination result and each second discrimination result, to obtain the trained GAN.
It should be noted that, in some embodiments, the first to-be-trained generator and the first to-be-trained discriminator may be trained alternately. That is, when training the first to-be-trained generator, the parameters of the first to-be-trained discriminator may be kept unchanged while the parameters of the generator are adjusted; when training the first to-be-trained discriminator, the parameters of the first to-be-trained generator may be kept unchanged while the parameters of the discriminator are adjusted.
A specific implementation of training the first to-be-trained discriminator is described first.
As an example, the label of the predicted shadow image corresponding to each target image sample is set to 0, meaning that the first to-be-trained discriminator expects all images output by the first to-be-trained generator to be fake. The label of the shadow image sample corresponding to the target image sample is set to 1.
A loss value LOSS_1 can be computed from the first discrimination result corresponding to each predicted shadow image, the label 0 of each predicted shadow image, and a preset loss function. A loss value LOSS_2 can be computed from the second discrimination result corresponding to each shadow image sample, the label 1 of the shadow image sample, and the preset loss function.
Then, the loss value LOSS_D1 is obtained by adding LOSS_1 and LOSS_2, and each parameter of the first to-be-trained discriminator is adjusted based on LOSS_D1 and backpropagation.
It should be noted that, while the parameters of the first to-be-trained discriminator are adjusted, the parameters of the first to-be-trained generator remain unchanged.
A specific implementation of training the first to-be-trained generator is described next.
As an example, the label of the predicted shadow image corresponding to each target image sample is set to 1, meaning that the first to-be-trained generator expects all of its own output images to be real.
A loss value LOSS_G1 can be computed from the first discrimination result corresponding to each predicted shadow image, the label 1 of each predicted shadow image, and the preset loss function. Then, each parameter of the first to-be-trained generator is adjusted based on LOSS_G1 and backpropagation. When training the first to-be-trained generator, the parameters of the first to-be-trained discriminator may be kept unchanged while the parameters of the generator are adjusted.
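For illustration, the sketch below shows one alternating update of the first generator/discriminator pair in the variant of FIG. 6, where the predicted shadow image and the shadow image sample are fed to the discriminator directly; using binary cross-entropy as the "preset loss function" is an assumption, and the modules passed in can be any networks such as the ones sketched earlier.

```python
import torch
import torch.nn.functional as F

def train_step_first_pair(g1, d1, opt_g1, opt_d1, target_batch, shadow_batch):
    """One alternating update of the first generator (g1) and discriminator (d1)."""
    # Discriminator step: generator parameters are kept fixed.
    with torch.no_grad():
        pred_shadow = g1(target_batch)            # predicted shadow images
    p_fake = d1(pred_shadow)                      # first discrimination result
    p_real = d1(shadow_batch)                     # second discrimination result
    loss_1 = F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake))  # label 0
    loss_2 = F.binary_cross_entropy(p_real, torch.ones_like(p_real))   # label 1
    loss_d1 = loss_1 + loss_2                     # LOSS_D1 = LOSS_1 + LOSS_2
    opt_d1.zero_grad()
    loss_d1.backward()
    opt_d1.step()

    # Generator step: only the generator's optimizer is stepped, so the
    # discriminator parameters stay unchanged.
    p_fake = d1(g1(target_batch))
    loss_g1 = F.binary_cross_entropy(p_fake, torch.ones_like(p_fake))   # label 1
    opt_g1.zero_grad()
    loss_g1.backward()
    opt_g1.step()
    return loss_d1.item(), loss_g1.item()
```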
It should be noted that, during training of the first to-be-trained generator and the first to-be-trained discriminator, the relative magnitudes of LOSS_D1 and LOSS_G1 can be used to decide whether training should focus more on the generator or on the discriminator. As an example, in the early stage of training, the first to-be-trained generator and the first to-be-trained discriminator are trained alternately, one step each; if LOSS_D1 is found to be large while LOSS_G1 is small, the generator is performing well, so the first to-be-trained discriminator may be trained 10 times and then the first to-be-trained generator once.
In other embodiments, the first to-be-trained generator and the first to-be-trained discriminator may also be trained simultaneously using the backpropagation algorithm. The process of training them simultaneously with backpropagation is not described in detail here; for details, refer to the above process of training the first to-be-trained generator and the first to-be-trained discriminator separately with backpropagation.
In some embodiments of this application, training of the first to-be-trained generator and the first to-be-trained discriminator may be stopped according to a first preset training stop condition.
As an example, the first preset training stop condition includes both LOSS_D1 and LOSS_G1 being smaller than a first preset threshold.
As another example, the first preset training stop condition includes the total number of training iterations of the first to-be-trained generator and the first to-be-trained discriminator reaching a preset iteration-count threshold.
It should be noted that, when training the first to-be-trained generator and the first to-be-trained discriminator, a different batch of training samples is used in each training iteration. For a batch of training samples, the accuracy of the first to-be-trained discriminator can be computed from each first discrimination result and each second discrimination result. If there are N training iterations, each iteration corresponds to one accuracy value of the first to-be-trained discriminator. The preset training stop condition further includes the average of the N accuracy values being greater than a preset accuracy threshold, where N is a positive integer.
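For illustration, assuming the discrimination results are probabilities and 0.5 is used as the real/fake decision threshold (an assumption beyond what is stated above), the per-iteration discriminator accuracy and the averaged stop check might be sketched as follows.

```python
import torch

def discriminator_accuracy(p_fake: torch.Tensor, p_real: torch.Tensor) -> float:
    """Fraction of correct decisions, assuming 0.5 as the real/fake threshold."""
    correct = (p_fake < 0.5).sum() + (p_real >= 0.5).sum()
    return correct.item() / (p_fake.numel() + p_real.numel())

def should_stop(per_iteration_accuracies: list, threshold: float = 0.9) -> bool:
    """Assumed form of the accuracy part of the first preset stop condition."""
    n = len(per_iteration_accuracies)
    return n > 0 and sum(per_iteration_accuracies) / n > threshold
```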
In other embodiments, when the inputs to the first discriminator are the first fused image and the second fused image, LOSS_D1 and LOSS_G1 can be computed from the label corresponding to each first fused image, each first discrimination result, the label corresponding to each second fused image, and each second discrimination result, so as to train the to-be-trained GAN. The specific training process is similar to the above process in which the predicted shadow image and the shadow image sample are separately input into the first to-be-trained discriminator, and is not repeated here.
In some embodiments of this application, if the pre-trained GAN includes only the trained first generator and the trained first discriminator, only the shadow image can be obtained with the GAN. To remove the shadow from a target image that contains a shadow, the shadow image may be removed directly from the target image to obtain the shadow-free image.
In other embodiments, however, to improve the accuracy of shadow removal from the target image, the shadow-free image may be obtained by a deep learning method; that is, the pre-trained GAN further includes a second generator and a second discriminator, and the second generator is used to obtain the shadow-free image directly.
If the pre-trained GAN further includes the second generator and the second discriminator, the to-be-trained GAN further includes a second to-be-trained generator and a second to-be-trained discriminator. In this scenario, step 240 includes steps 2401 to 2403. Step 2401: For each first fused image, input the first fused image into the second to-be-trained generator to obtain a predicted shadow-free image. Step 2402: For each target image sample, obtain a third discrimination result based on the predicted shadow-free image corresponding to the target image sample and the second to-be-trained discriminator, and obtain a fourth discrimination result based on the shadow-free image sample corresponding to the target image sample and the second to-be-trained discriminator. Step 2403: Train the first to-be-trained generator and the first to-be-trained discriminator based on each first discrimination result and each second discrimination result, and train the second to-be-trained generator and the second to-be-trained discriminator based on each third discrimination result and each fourth discrimination result, to obtain the trained GAN.
FIG. 8 is still another schematic flowchart of training the GAN according to an embodiment of this application. The flow in FIG. 8 differs from that in FIG. 7 in that, in some embodiments, for each first fused image, the first fused image is input into the second to-be-trained generator to obtain a predicted shadow-free image. For each predicted shadow-free image, the predicted shadow-free image is input into the second to-be-trained discriminator to obtain a third discrimination result. For each shadow-free image sample, the shadow-free image sample is input into the second to-be-trained discriminator to obtain a fourth discrimination result. The second to-be-trained generator and the second to-be-trained discriminator can then be trained based on the third discrimination results and the fourth discrimination results.
It should be noted that, in the embodiments of this application, the process of training the second to-be-trained generator and the second to-be-trained discriminator is similar to the above process of training the first to-be-trained generator and the first to-be-trained discriminator.
In some embodiments, the second to-be-trained generator and the second to-be-trained discriminator may be trained alternately. A specific implementation of training the second to-be-trained discriminator is described first.
As an example, the label of each predicted shadow-free image is set to 0, meaning that the second to-be-trained discriminator expects all images output by the second to-be-trained generator to be fake.
A loss value LOSS_3 can be computed from the third discrimination result corresponding to each predicted shadow-free image, the label 0 of each predicted shadow-free image, and a preset loss function. A loss value LOSS_4 can be computed from the fourth discrimination result corresponding to each shadow-free image sample, the label 1 of the shadow-free image sample, and the preset loss function.
Then, the loss value LOSS_D2 is obtained by adding LOSS_3 and LOSS_4, and each parameter of the second to-be-trained discriminator is adjusted based on LOSS_D2 and backpropagation.
It should be noted that, while the parameters of the second to-be-trained discriminator are adjusted, the parameters of the second to-be-trained generator remain unchanged.
A specific implementation of training the second to-be-trained generator is described next.
As an example, the label of each predicted shadow-free image is set to 1, meaning that the second to-be-trained generator expects all of its own output images to be real. A loss value LOSS_G2 can be computed from the third discrimination result corresponding to each predicted shadow-free image, the label 1 of each predicted shadow-free image, and the preset loss function. Then, each parameter of the second to-be-trained generator is adjusted based on LOSS_G2 and backpropagation. It should be noted that, while the parameters of the second to-be-trained generator are adjusted, the parameters of the second to-be-trained discriminator remain unchanged.
It should be noted that, during training of the second to-be-trained generator and the second to-be-trained discriminator, the relative magnitudes of LOSS_D2 and LOSS_G2 can be used to decide whether training should focus more on the second to-be-trained generator or on the second to-be-trained discriminator.
In other embodiments, the second to-be-trained generator and the second to-be-trained discriminator may also be trained simultaneously using the backpropagation algorithm. The process of training them simultaneously with backpropagation is not described in detail here; for details, refer to the above process of training the second to-be-trained generator and the second to-be-trained discriminator separately with backpropagation.
In some embodiments of this application, training of the second to-be-trained generator and the second to-be-trained discriminator may be stopped according to a second preset training stop condition.
As an example, the second preset training stop condition includes both LOSS_D2 and LOSS_G2 being smaller than a second preset threshold.
As another example, the second preset training stop condition includes the number of training iterations of the second to-be-trained generator and the second to-be-trained discriminator reaching a preset iteration-count threshold.
It should be noted that, when training the second to-be-trained generator and the second to-be-trained discriminator, a different batch of training samples is used in each training iteration. For a batch of training samples, the accuracy of the second to-be-trained discriminator can be computed from each third discrimination result and each fourth discrimination result. If there are M training iterations, each iteration corresponds to one accuracy value of the second to-be-trained discriminator. The second preset training stop condition further includes the average of the M accuracy values being greater than a preset accuracy threshold, where M is a positive integer.
In other embodiments of this application, to improve the discrimination accuracy of the second to-be-trained discriminator, step 2403 includes: for each target image sample, performing channel fusion on the predicted shadow-free image corresponding to the target image sample and the first fused image corresponding to the target image sample to obtain a third fused image, and performing channel fusion on the shadow-free image sample corresponding to the target image sample and the second fused image corresponding to the target image sample to obtain a fourth fused image; and for each third fused image and each fourth fused image, inputting the third fused image and the fourth fused image separately into the second to-be-trained discriminator to obtain the third discrimination result and the fourth discrimination result.
FIG. 9 is still another schematic flowchart of training the GAN according to an embodiment of this application. The flow in FIG. 9 differs from that in FIG. 8 in that, in FIG. 9, the third fused image is obtained by channel fusion of the first fused image and the predicted shadow-free image and is then input into the second to-be-trained discriminator to obtain the third discrimination result, and the fourth fused image is obtained by channel fusion of the second fused image and the shadow-free image sample and is then input into the second to-be-trained discriminator to obtain the fourth discrimination result. In other words, the inputs to the second to-be-trained discriminator are the third fused image and the fourth fused image, respectively. Finally, the second to-be-trained generator and the second to-be-trained discriminator are trained based on the third discrimination results and the fourth discrimination results.
In some embodiments of this application, when the inputs to the second to-be-trained discriminator are the third fused image and the fourth fused image, LOSS_D2 and LOSS_G2 can be computed from the label corresponding to each third fused image, each third discrimination result, the label corresponding to each fourth fused image, and each fourth discrimination result, so as to train the to-be-trained GAN. The specific training process is similar to the above process in which the predicted shadow-free image and the shadow-free image sample are separately input into the second to-be-trained discriminator, and is not repeated here.
It should be noted that, during training of the GAN, both the generator and the discriminator need to be trained; however, when the pre-trained GAN is used to remove the shadow from the target image, only the generator of the pre-trained GAN may be used.
In some embodiments, step 120 includes step 1201 and step 1202. Step 1201: Input the target image into the first generator of the GAN to obtain a shadow image corresponding to the target image. Step 1202: Obtain a shadow-free image based on the shadow image and the target image.
In the embodiments of this application, the first generator is what is obtained after training of the first to-be-trained generator is completed, and the first discriminator is what is obtained after training of the first to-be-trained discriminator is completed. The role of the first generator is to generate the corresponding shadow image from the input target image that includes a shadow.
In some embodiments, to improve shadow removal efficiency, the GAN includes only the first generator and the first discriminator, and step 1202 includes removing the shadow image from the target image to obtain the shadow-free image.
In the embodiments of this application, if the GAN includes only the first generator and the first discriminator, the time required to train the GAN is reduced, so the GAN can be built more efficiently, and the shadow in the target image can thus be removed more efficiently.
In other embodiments, to improve the accuracy of shadow removal from the image, the trained GAN further includes the second generator and the second discriminator. In the embodiments of this application, the second generator is what is obtained after training of the second to-be-trained generator is completed, and the second discriminator is what is obtained after training of the second to-be-trained discriminator is completed. The role of the second generator is to generate a shadow-free image from the target fused image obtained by fusing the input target image and the shadow image. Accordingly, step 1202 includes: performing channel fusion on the shadow image and the target image to obtain a target fused image; and inputting the target fused image into the second generator to obtain the shadow-free image.
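Putting steps 1201 and 1202 together, an inference sketch might look as follows; the 6-channel input width of the second generator and the network layers shown are illustrative assumptions, and any trained first generator (such as the one sketched earlier) can be passed in.

```python
import torch
import torch.nn as nn

class ShadowFreeGenerator(nn.Module):
    """Maps a 6-channel target fused image to a 3-channel shadow-free image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def remove_shadow(target_image, first_generator, second_generator):
    """Step 1201 then step 1202: predict the shadow image, fuse, and restore."""
    shadow_image = first_generator(target_image)                   # step 1201
    target_fused = torch.cat([shadow_image, target_image], dim=1)  # channel fusion
    return second_generator(target_fused)                          # step 1202
```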
FIG. 10 is a schematic diagram of the image obtained after removing the shadow in FIG. 2 according to an embodiment of this application. FIG. 10 shows an image of the document without the shadow.
In the embodiments of this application, by inputting the target fused image into the second generator, the shadow-free image is obtained by deep learning, with higher accuracy.
In the embodiments of this application, training samples are obtained with a 3D rendering engine and are then used to train the GAN, so as to remove shadows from images and address the pain point of having to keep adjusting one's posture when shooting certain images.
In the embodiments of this application, the 3D rendering engine can simulate a wide variety of real-world environments, so the generated data have broad coverage; meanwhile, the imaging parameters in the 3D rendering engine, that is, the simulated imaging conditions, can be adjusted freely, so the generated training sample data are diverse. Moreover, adding actually captured shadow-free image samples to the 3D rendering engine improves the realism and reliability of the generated training sample data while reducing rendering time and modeling difficulty. Furthermore, combining the 3D rendering engine with GAN training improves the training efficiency of the network, so shadows can be removed from images quickly, making the overall image clearer and cleaner.
The image processing method provided in the embodiments of this application may be performed by an image processing apparatus, or by a control module in the image processing apparatus for performing the image processing method. In the embodiments of this application, the image processing apparatus performing the image processing method is taken as an example to describe the image processing method provided in the embodiments of this application.
FIG. 11 is a schematic structural diagram of an image processing apparatus according to an embodiment of this application. As shown in FIG. 11, the image processing apparatus 300 includes:
a target image obtaining module 310, configured to obtain a target image that includes a shadow; and
a shadow removal module 320, configured to remove the shadow from the target image based on a target model to obtain a shadow-free image.
The target model is a model trained on target image samples that include shadows and shadow image samples corresponding to the target image samples.
The target image samples are samples generated based on captured shadow-free image samples and preset simulated imaging conditions; the shadow image samples are samples determined based on the target image samples and the shadow-free image samples.
In the embodiments of this application, preset simulated imaging conditions can emulate a wide variety of real-world environments, while actually captured shadow-free image samples are real image data. By generating target image samples from actually captured shadow-free image samples and preset simulated imaging conditions, real data are combined with simulated imaging conditions, so that highly realistic target image samples, and their corresponding shadow image samples, can be obtained quickly for training the target model that removes image shadows, which improves both the training accuracy and the training speed of the target model. Because the trained target model is obtained by training on target image samples that include shadows and their corresponding shadow image samples, the trained target model can remove the shadow from the target image; and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
In some embodiments, the target model includes a generative adversarial network.
In some embodiments, the shadow removal module 320 includes:
a shadow image determining unit, configured to input the target image into the first generator of the GAN to obtain a shadow image corresponding to the target image; and
a shadow removal unit, configured to obtain a shadow-free image based on the shadow image and the target image.
In some embodiments, to improve the accuracy of removing shadows from the image, the GAN further includes a second generator;
the shadow removal unit is configured to:
perform channel fusion on the shadow image and the target image to obtain a target fused image; and
input the target fused image into the second generator to obtain the shadow-free image.
In some embodiments, to improve the efficiency of removing shadows from the image, the shadow removal unit is configured to:
remove the shadow image from the target image to obtain the shadow-free image.
The image processing apparatus in the embodiments of this application may be an apparatus, or a component, integrated circuit, or chip in an apparatus. The apparatus may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, a tablet computer, a laptop computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA); the non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine. This is not specifically limited in the embodiments of this application.
The image processing apparatus in the embodiments of this application may be an apparatus with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of this application.
The image processing apparatus provided in the embodiments of this application can implement each process of the method embodiment in FIG. 1. To avoid repetition, details are not repeated here.
Optionally, an embodiment of this application further provides an electronic device, including a processor, a memory, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement each process of the above image processing method embodiment and can achieve the same technical effect. To avoid repetition, details are not repeated here.
It should be noted that the electronic device in the embodiments of this application includes the above-mentioned mobile electronic device and non-mobile electronic device.
FIG. 12 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of this application.
The electronic device 400 includes but is not limited to components such as a radio frequency unit 401, a network module 402, an audio output unit 403, an input unit 404, a sensor 405, a display unit 406, a user input unit 407, an interface unit 408, a memory 409, and a processor 410.
A person skilled in the art can understand that the electronic device 400 may further include a power supply (such as a battery) that supplies power to each component. The power supply may be logically connected to the processor 410 through a power management system, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management system. The structure of the electronic device shown in FIG. 12 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, or combine some components, or have a different component arrangement. Details are not repeated here.
The processor 410 is configured to: obtain a target image that includes a shadow; and remove the shadow from the target image based on a target model to obtain a shadow-free image, where the target model is a model trained on target image samples that include shadows and shadow image samples corresponding to the target image samples, the target image samples are samples generated based on captured shadow-free image samples and preset simulated imaging conditions, and the shadow image samples are samples determined based on the target image samples and the shadow-free image samples.
In the embodiments of this application, preset simulated imaging conditions can emulate a wide variety of real-world environments, while actually captured shadow-free image samples are real image data. By generating target image samples from actually captured shadow-free image samples and preset simulated imaging conditions, real data are combined with simulated imaging conditions, so that highly realistic target image samples, and their corresponding shadow image samples, can be obtained quickly for training the target model that removes image shadows, which improves both the training accuracy and the training speed of the target model. Because the trained target model is obtained by training on target image samples that include shadows and their corresponding shadow image samples, the trained target model can remove the shadow from the target image; and because both the training speed and the accuracy of the target model are improved, the shadow in the target image can be removed quickly and accurately.
The processor 410 is further configured to: input the target image into the first generator of the GAN to obtain a shadow image corresponding to the target image; and obtain a shadow-free image based on the shadow image and the target image.
In the embodiments of this application, by obtaining the shadow image corresponding to the target image, the shadow in the target image can be removed.
Optionally, the GAN further includes a second generator, and the processor 410 is further configured to: perform channel fusion on the shadow image and the target image to obtain a target fused image; and input the target fused image into the second generator to obtain the shadow-free image.
In some embodiments of this application, if the GAN includes only the first generator and the first discriminator, the time required to train the GAN is reduced, so the GAN can be built more efficiently, and the shadow in the target image can thus be removed more efficiently.
In some embodiments of this application, by inputting the target fused image into the second generator, the shadow-free image is obtained by deep learning, with higher accuracy.
Optionally, the processor 410 is further configured to remove the shadow image from the target image to obtain the shadow-free image.
In the embodiments of this application, if the GAN includes only the first generator and the first discriminator, the time required to train the GAN is reduced, so the GAN can be built more efficiently, and the shadow in the target image can thus be removed more efficiently.
An embodiment of this application further provides a computer-readable storage medium storing a program or instructions, where the program or instructions, when executed by a processor, implement each process of the above image processing method embodiment and can achieve the same technical effect. To avoid repetition, details are not repeated here.
The processor is the processor in the electronic device in the above embodiment. Examples of the computer-readable storage medium include non-transitory computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.
An embodiment of the present invention further provides an electronic device configured to perform each process of the above image processing method embodiment, which can achieve the same technical effect. To avoid repetition, details are not repeated here.
An embodiment of this application further provides a chip, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement each process of the above image processing method embodiment and can achieve the same technical effect. To avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of this application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "including a ..." does not preclude the existence of other identical elements in the process, method, article, or apparatus that includes that element. In addition, it should be pointed out that the scope of the methods and apparatuses in the implementations of this application is not limited to performing functions in the order shown or discussed; it may also include performing functions in a substantially simultaneous manner or in reverse order according to the functions involved. For example, the described methods may be performed in an order different from the described one, and steps may also be added, omitted, or combined. In addition, features described with reference to some examples may be combined in other examples.
From the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and certainly also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the above specific implementations, which are merely illustrative rather than restrictive. Inspired by this application, a person of ordinary skill in the art may devise many other forms without departing from the spirit of this application and the scope protected by the claims, all of which fall within the protection of this application.

Claims (15)

  1. An image processing method, comprising:
    obtaining a target image that includes a shadow; and
    removing the shadow from the target image based on a target model to obtain a shadow-free image, wherein the target model is a model trained on target image samples that include shadows and shadow image samples corresponding to the target image samples, the target image samples are samples generated based on captured shadow-free image samples and preset simulated imaging conditions, and the shadow image samples are samples determined based on the target image samples and the shadow-free image samples.
  2. The method according to claim 1, wherein the target model comprises a generative adversarial network.
  3. The method according to claim 2, wherein the removing the shadow from the target image based on a target model to obtain a shadow-free image comprises:
    inputting the target image into a first generator of the generative adversarial network to obtain a shadow image corresponding to the target image; and
    obtaining the shadow-free image based on the shadow image and the target image.
  4. The method according to claim 3, wherein the generative adversarial network further comprises a second generator, and the obtaining the shadow-free image based on the shadow image and the target image comprises:
    performing channel fusion on the shadow image and the target image to obtain a target fused image; and
    inputting the target fused image into the second generator to obtain the shadow-free image.
  5. The method according to claim 3, wherein the obtaining the shadow-free image based on the shadow image and the target image comprises:
    removing the shadow image from the target image to obtain the shadow-free image.
  6. An image processing apparatus, comprising:
    a target image obtaining module, configured to obtain a target image that includes a shadow; and
    a shadow removal module, configured to remove the shadow from the target image based on a target model to obtain a shadow-free image, wherein the target model is a model trained on target image samples that include shadows and shadow image samples corresponding to the target image samples, the target image samples are samples generated based on captured shadow-free image samples and preset simulated imaging conditions, and the shadow image samples are samples determined based on the target image samples and the shadow-free image samples.
  7. The apparatus according to claim 6, wherein the target model comprises a generative adversarial network.
  8. The apparatus according to claim 7, wherein the shadow removal module comprises:
    a shadow image determining unit, configured to input the target image into a first generator of the generative adversarial network to obtain a shadow image corresponding to the target image; and
    a shadow removal unit, configured to obtain the shadow-free image based on the shadow image and the target image.
  9. The apparatus according to claim 8, wherein the generative adversarial network further comprises a second generator; and
    the shadow removal unit is configured to:
    perform channel fusion on the shadow image and the target image to obtain a target fused image; and
    input the target fused image into the second generator to obtain the shadow-free image.
  10. The apparatus according to claim 8, wherein the shadow removal unit is configured to:
    remove the shadow image from the target image to obtain the shadow-free image.
  11. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the image processing method according to any one of claims 1 to 5.
  12. An electronic device, configured to perform the steps of the image processing method according to any one of claims 1 to 5.
  13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 5.
  14. A computer program product, executable by a processor to implement the steps of the image processing method according to any one of claims 1 to 5.
  15. A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the steps of the image processing method according to any one of claims 1 to 5.
PCT/CN2021/093754 2020-05-21 2021-05-14 Image processing method and apparatus WO2021233215A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21808679.1A EP4156081A4 (en) 2020-05-21 2021-05-14 IMAGE PROCESSING METHOD AND DEVICE
US17/987,987 US20230076026A1 (en) 2020-05-21 2022-11-16 Image processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010435861.8 2020-05-21
CN202010435861.8A CN111667420B (zh) 2020-05-21 2020-05-21 Image processing method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/987,987 Continuation US20230076026A1 (en) 2020-05-21 2022-11-16 Image processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2021233215A1 true WO2021233215A1 (zh) 2021-11-25

Family

ID=72384278

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/093754 WO2021233215A1 (zh) 2020-05-21 2021-05-14 图像处理方法及装置

Country Status (4)

Country Link
US (1) US20230076026A1 (zh)
EP (1) EP4156081A4 (zh)
CN (1) CN111667420B (zh)
WO (1) WO2021233215A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667420B (zh) * 2020-05-21 2023-10-24 维沃移动通信有限公司 图像处理方法及装置
CN112862714A (zh) * 2021-02-03 2021-05-28 维沃移动通信有限公司 图像处理方法及装置
CN113139917A (zh) * 2021-04-23 2021-07-20 Oppo广东移动通信有限公司 图像处理方法、装置、电子设备及存储介质
CN113205530A (zh) * 2021-04-25 2021-08-03 Oppo广东移动通信有限公司 阴影区域处理方法及装置、计算机可读介质和电子设备
CN114445515B (zh) * 2022-02-14 2024-10-18 深圳市赛禾医疗技术有限公司 图像伪影去除方法、装置、电子设备及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978807A (zh) * 2019-04-01 2019-07-05 西北工业大学 Shadow removal method based on generative adversarial networks
CN110033423A (zh) * 2019-04-16 2019-07-19 北京字节跳动网络技术有限公司 Method and apparatus for processing images
CN111667420A (zh) * 2020-05-21 2020-09-15 维沃移动通信有限公司 Image processing method and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105354833B (zh) * 2015-10-12 2019-02-15 浙江宇视科技有限公司 Shadow detection method and apparatus
CN110782409B (zh) * 2019-10-21 2023-05-09 太原理工大学 Method for removing shadows of multiple moving objects

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978807A (zh) * 2019-04-01 2019-07-05 西北工业大学 Shadow removal method based on generative adversarial networks
CN110033423A (zh) * 2019-04-16 2019-07-19 北京字节跳动网络技术有限公司 Method and apparatus for processing images
CN111667420A (zh) * 2020-05-21 2020-09-15 维沃移动通信有限公司 Image processing method and apparatus

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
See also references of EP4156081A4 *
WANG JIFENG; LI XIANG; YANG JIAN: "Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 18 June 2018 (2018-06-18), pages 1788 - 1797, XP033476143, DOI: 10.1109/CVPR.2018.00192 *
WANG, JIFENG: "Deep Learning Methods for Shadow Removal and Haze Removal", INFORMATION SCIENCE & TECHNOLOGY, CHINA MASTER’S THESES FULL-TEXT DATABASE, 15 June 2020 (2020-06-15), pages 1 - 73, XP055870069 *

Also Published As

Publication number Publication date
EP4156081A1 (en) 2023-03-29
US20230076026A1 (en) 2023-03-09
EP4156081A4 (en) 2023-12-06
CN111667420B (zh) 2023-10-24
CN111667420A (zh) 2020-09-15

Similar Documents

Publication Publication Date Title
WO2021233215A1 (zh) 图像处理方法及装置
US11756223B2 (en) Depth-aware photo editing
CN112884881B (zh) 三维人脸模型重建方法、装置、电子设备及存储介质
US11663733B2 (en) Depth determination for images captured with a moving camera and representing moving features
US20200279120A1 (en) Method, apparatus and system for liveness detection, electronic device, and storage medium
TWI752473B (zh) 圖像處理方法及裝置、電子設備和電腦可讀儲存媒體
CN114445562A (zh) 三维重建方法及装置、电子设备和存储介质
CN111080546A (zh) 一种图片处理方法及装置
WO2023217138A1 (zh) 一种参数配置方法、装置、设备、存储介质及产品
WO2022183656A1 (zh) 数据生成方法、装置、设备、存储介质及程序
US11403788B2 (en) Image processing method and apparatus, electronic device, and storage medium
CN117319790A (zh) 基于虚拟现实空间的拍摄方法、装置、设备及介质
US20240161391A1 (en) Relightable neural radiance field model
CN111754635B (zh) 纹理融合方法及装置、电子设备和存储介质
WO2024056020A1 (zh) 一种双目图像的生成方法、装置、电子设备及存储介质
CN109816791B (zh) 用于生成信息的方法和装置
US9230508B2 (en) Efficient feedback-based illumination and scatter culling
WO2023001110A1 (zh) 神经网络训练方法、装置及电子设备
CN113506320B (zh) 图像处理方法及装置、电子设备和存储介质
EP4283566A2 (en) Single image 3d photography with soft-layering and depth-aware inpainting
CN109842791A (zh) 一种图像处理方法及装置
CN112037227B (zh) 视频拍摄方法、装置、设备及存储介质
CN112861687B (zh) 用于门禁系统的口罩佩戴检测方法、装置、设备和介质
KR20180053494A (ko) 모바일 환경에서의 증강현실 게임공간 구축방법
CN111314627B (zh) 用于处理视频帧的方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21808679

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021808679

Country of ref document: EP

Effective date: 20221221