
WO2023066173A1 - Image processing method and apparatus, and storage medium and electronic device - Google Patents


Info

Publication number
WO2023066173A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
shadow
output
neural network
processed
Prior art date
Application number
PCT/CN2022/125573
Other languages
French (fr)
Chinese (zh)
Inventor
叶平 (Ye Ping)
张志伟 (Zhang Zhiwei)
鲍天龙 (Bao Tianlong)
Original Assignee
ArcSoft Corporation Limited (虹软科技股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ArcSoft Corporation Limited (虹软科技股份有限公司)
Priority to KR1020247015956A priority Critical patent/KR20240089729A/en
Publication of WO2023066173A1 publication Critical patent/WO2023066173A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • G06T5/94Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present application relates to image processing technologies, and in particular, to an image processing method, device, storage medium, and electronic equipment.
  • current shadow removal methods either fail to remove the shadow completely, lose information from the background layer, or run too slowly, which makes them impractical for ordinary users.
  • An existing shadow removal method uses a neural network consisting of three modules, namely a global localization module, an appearance modeling module, and a semantic modeling module.
  • the global localization module is responsible for detecting the shadow area and obtaining the location features of the shadow area;
  • the appearance modeling module is used to learn the characteristics of the non-shaded area, so that the output of the network is consistent with the labeled data (Ground Truth, GT) in the non-shaded area;
  • a semantic modeling module is used to restore the original content behind shadows.
  • this method does not directly output the background image after shadow removal, but rather the ratio of the shadow image to the background image. The background image must then be obtained by dividing the shadow image by the network output pixel by pixel, which introduces a large amount of extra computation. At the same time, the division may harm numerical stability because of possible division by zero.
  • Embodiments of the present application provide an image processing method, device, storage medium, and electronic equipment, to at least solve the technical problems in the prior art that removing shadow areas easily causes side effects on the image background layer, and that existing methods place high demands on the hardware platform.
  • an image processing method, including: acquiring an image to be processed that contains a shadow area; and inputting the image to be processed into a trained neural network to obtain a shadow-removed image; wherein the neural network includes a two-level cascade of a first-level network and a second-level network: the first-level network receives the image to be processed and outputs a mask map of the shadow area, and the second-level network receives the image to be processed and the shadow area mask map at the same time and outputs the shadow-removed image.
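The two-level cascade above can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the two toy stage functions are hypothetical stand-ins for the trained first-level and second-level networks.

```python
import numpy as np

def run_cascade(image, stage1, stage2):
    """Two-level cascade: stage1 predicts a shadow-area mask from the
    input image; stage2 receives the image and the mask together
    (stacked along the channel axis) and outputs the shadow-removed image."""
    mask = stage1(image)                                   # (H, W) mask
    x = np.concatenate([image, mask[..., None]], axis=-1)  # (H, W, C+1)
    return stage2(x), mask

# Toy stand-ins for the two sub-networks (hypothetical, not the patent's UNets):
toy_stage1 = lambda img: (img.mean(axis=-1) < 0.5).astype(np.float32)
toy_stage2 = lambda x: x[..., :3] + 0.2 * x[..., 3:4]      # brighten masked pixels

img = np.full((4, 4, 3), 0.3, dtype=np.float32)
out, mask = run_cascade(img, toy_stage1, toy_stage2)
```

The key design point carried over from the text is that the second stage sees both the original pixels and the predicted mask, so it knows where to restore content.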
  • the first-level network includes: a first feature extraction module, including a first encoder, for extracting features of the image to be processed layer by layer to obtain a first set of feature data; and a shadow area estimation module, connected to the output of the first feature extraction module and including a first decoder, for estimating the shadow area based on the first set of feature data and outputting a mask map of the shadow area.
  • the second-level network includes: a second feature extraction module, including a second encoder, connected to the output of the first-level network, which receives the shadow area mask map output by the first-level network together with the image to be processed and obtains a second set of feature data; and a result map output module, connected to the output of the second feature extraction module and including a second decoder, for outputting the shadow-removed image based on the second set of feature data.
  • each layer of the first decoder or the second decoder is spliced along the channel axis with the output of the corresponding layer of the first encoder or the second encoder through a cross-layer connection.
  • a multi-scale pyramid pooling module is added on the cross-layer connections between each decoder and the first or second encoder; the module fuses features of different scales.
  • the image processing method further includes: using an image pyramid algorithm to downsample the image to be processed, saving the gradient information of all pyramid levels during downsampling to form a Laplacian pyramid; feeding the smallest layer into the trained neural network to obtain an output image; and using the Laplacian pyramid to reconstruct the output image from low resolution to high resolution to obtain the shadow-removed image.
  • the above-mentioned image processing method further includes: constructing an initial neural network; and using sample data to train the initial neural network to obtain a trained neural network, wherein the sample data includes real-shot images and synthetic shadow images, and each synthetic shadow image is synthesized from a pure shadow map and a shadow-free map using an image synthesis method.
  • synthesizing the composite shadow image from the pure shadow map and the shadow-free map using the image synthesis method includes: obtaining the pure shadow map; obtaining the shadow-free map; and obtaining the composite shadow image based on the pure shadow map and the shadow-free map.
  • the image synthesis method further includes: transforming the pure shadow map and obtaining the composite shadow image based on the transformed pure shadow map and the shadow-free map, wherein the pixel values of the non-shadow areas in the transformed pure shadow map are uniformly set to a fixed value a, and the pixel values of the shadow areas take values between 0 and a, where a is a positive integer.
  • the initial neural network also includes a module for classifying the sample data.
  • for a real-shot image, the labeled data is the shadow-removed image collected in the real scene, and the parameters inside the second-level network are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-removed image serving as the labeled data;
  • for a synthetic shadow image, the labeled data includes the shadow-free map and the pure shadow map: the parameters inside the first-level network are adjusted according to the difference between the shadow area mask map and the pure shadow map, and the parameters inside the second-level network are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-free map.
  • the loss function includes at least one of the following: pixel loss, feature loss, structural similarity loss, adversarial loss, shadow edge loss, and shadow brightness loss.
  • the pixel loss includes a pixel truncation loss.
  • when the absolute difference between corresponding pixels of the output image of the initial neural network and the label image is greater than a given threshold, the loss of the two pixels is calculated; when it is not greater than the threshold, the difference between the two pixels is ignored.
  • the shadow brightness loss constrains the brightness of the region corresponding to the shadow area in the shadow-removed image output by the neural network to exceed the brightness of the shadow area in the input image to be processed (i.e., the brightness difference is greater than 0), and is used to raise the brightness of the shadow region in the shadow-removed image.
  • the above image processing method includes: performing dilation on the shadow area mask map to obtain a dilated map; performing erosion on the shadow area mask map to obtain an eroded map; and taking the difference between the dilated map and the eroded map as the boundary region between shadow and non-shadow areas, which is smoothed using TVLoss.
  • an image processing device, including: an image acquisition unit for acquiring an image to be processed that contains a shadow area; and a processing unit for receiving the image to be processed and processing it with a trained neural network to obtain a shadow-removed image; wherein the neural network includes a two-level cascade of a first-level network and a second-level network, the first-level network receives the image to be processed and outputs a shadow area mask map, and the second-level network receives the image to be processed and the shadow area mask map at the same time and outputs the shadow-removed image.
  • the first-level network includes: a first feature extraction module, including a first encoder, for extracting features of the image to be processed layer by layer to obtain a first set of feature data; and a shadow area estimation module, connected to the output of the first feature extraction module and including a first decoder, for estimating the shadow area based on the first set of feature data and outputting a mask map of the shadow area.
  • the second-level network includes: a second feature extraction module, including a second encoder, connected to the output of the first-level network, which receives the shadow area mask map output by the first-level network together with the image to be processed and obtains a second set of feature data; and a result map output module, connected to the output of the second feature extraction module and including a second decoder, for outputting the shadow-removed image based on the second set of feature data.
  • a storage medium including a stored program, wherein, when the program is running, the device where the storage medium is located is controlled to execute any one of the image processing methods described above.
  • an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute the executable instructions so as to perform any one of the image processing methods described above.
  • This application proposes a fast and effective shadow elimination method suitable for mobile terminals such as mobile phones. It captures the characteristics of the physical phenomenon of shadows, synthesizes training material with a strong sense of realism, and combines a variety of loss functions with an effective network structure and modules to achieve better shadow elimination.
  • this application uses down-sampling and network pruning techniques, so that even high-resolution images can still be processed very quickly.
  • Fig. 1 is a flow chart of an optional image processing method according to an embodiment of the present application
  • FIG. 2 is a structural diagram of an optional neural network according to an embodiment of the present application.
  • FIG. 3 is a flowchart of an optional training neural network according to an embodiment of the present application.
  • FIG. 4 is a flow chart of an optional image synthesis method according to an embodiment of the present application.
  • Fig. 5 (a) and Fig. 5 (b) are the comparison diagrams of the effect of removing shadows by using the image processing method of the embodiment of the present application;
  • Fig. 6 is a structural block diagram of an optional image processing apparatus according to an embodiment of the present application.
  • FIG. 1 is a flowchart of an optional image processing method according to an embodiment of the present application. As shown in FIG. 1, the image processing method includes the following steps:
  • the neural network includes a two-stage cascaded first-level network and a second-level network, and the first-level network receives the image to be processed and outputs a shadow Area mask map, the second-level network receives the image to be processed and the shadow area mask map at the same time, and outputs the shadowed image.
  • the neural network includes a two-stage cascade of a first-level network 20 and a second-level network 22; the first-level network includes a first feature extraction module 200 and a shadow area estimation module 202, and the second-level network includes a second feature extraction module 204 and a result map output module 206.
  • the first feature extraction module 200 includes a first encoder for extracting the features of the image to be processed layer by layer to obtain the first set of feature data;
  • the shadow area estimation module 202 is connected to the output of the first feature extraction module 200 and includes a first decoder for estimating the shadow area based on the first set of feature data and outputting a mask map of the shadow area;
  • the second feature extraction module 204 includes a second encoder connected to the output of the first-level network; it receives the shadow area mask map output by the first-level network while receiving the image to be processed, so as to obtain the second set of feature data;
  • the result map output module 206 is connected to the output of the second feature extraction module 204 and includes a second decoder for outputting the shadow-removed image based on the second set of feature data.
  • the first-level network and the second-level network have the same structure except for the number of input channels. For example, they can be constructed based on the classic segmentation network UNet.
  • the outputs of each layer of the two encoders are respectively concatenated with the outputs of the corresponding layers of the two decoders along the channel axis through cross-layer connections.
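The cross-layer (skip) connection described above amounts to a concatenation along the channel axis. A minimal NumPy sketch (the feature maps here are placeholders, not outputs of the patent's networks):

```python
import numpy as np

def skip_concat(decoder_feat, encoder_feat):
    """Cross-layer connection: splice a decoder layer's output with the
    matching encoder layer's output along the channel axis (axis 0 for
    channel-first (C, H, W) feature maps)."""
    assert decoder_feat.shape[1:] == encoder_feat.shape[1:], "spatial sizes must match"
    return np.concatenate([decoder_feat, encoder_feat], axis=0)

dec = np.zeros((8, 16, 16))   # hypothetical decoder features
enc = np.ones((8, 16, 16))    # features saved from the matching encoder layer
fused = skip_concat(dec, enc)
```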
  • the multi-scale pyramid pooling module includes multiple pooling layers with different kernel sizes, convolutional layers, and interpolation upsampling layers. First, features of different scales are extracted through the pooling layers; low-level and/or high-level features are then extracted through the convolutional layers; the results are adjusted through the interpolation upsampling layers to the same size as the output of the corresponding encoder and decoder layers; and finally everything is spliced into one feature along the channel axis.
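The pool–resize–splice pipeline can be sketched as follows. This is a simplified NumPy illustration (box-filter pooling and nearest-neighbour upsampling stand in for the module's pooling and interpolation layers, and the intermediate convolutions are omitted):

```python
import numpy as np

def avg_pool(feat, k):
    """Average-pool a (C, H, W) map with kernel/stride k (H, W divisible by k)."""
    c, h, w = feat.shape
    return feat.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def upsample(feat, k):
    """Nearest-neighbour upsampling standing in for interpolation upsampling."""
    return feat.repeat(k, axis=1).repeat(k, axis=2)

def pyramid_pool(feat, kernels=(2, 4)):
    """Extract features at several scales, resize them back to the input
    size, and splice everything along the channel axis."""
    branches = [upsample(avg_pool(feat, k), k) for k in kernels]
    return np.concatenate([feat] + branches, axis=0)

feat = np.random.rand(2, 8, 8)
fused = pyramid_pool(feat)            # (2 + 2*2, 8, 8) channels
```

Each branch summarizes the map at a coarser scale, which is what lets the network handle shadows of different areas and degrees.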
  • the multi-scale pyramid pooling module integrates features of different scales, which enhances the generalization of the network and enables the network to achieve better results on shadow maps of different areas and degrees.
  • the model can be pruned by replacing the convolutional layers in the encoder with grouped convolutions in which each convolution kernel convolves only one channel, thereby reducing the amount of computation of the model and increasing processing speed.
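The extreme case of grouped convolution (groups equal to channels, i.e. depthwise convolution) can be written out explicitly. A naive NumPy sketch for illustration only; a real implementation would use an optimized library routine:

```python
import numpy as np

def depthwise_conv(x, kernels):
    """Grouped convolution with groups == channels: each kernel convolves
    only its own channel, so the multiply-add count drops by a factor of C
    compared with a dense convolution over all channels.
    x: (C, H, W); kernels: (C, k, k); 'valid' padding, stride 1."""
    c, h, w = x.shape
    k = kernels.shape[1]
    out = np.zeros((c, h - k + 1, w - k + 1))
    for ci in range(c):                       # one kernel per channel
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[ci, i, j] = (x[ci, i:i + k, j:j + k] * kernels[ci]).sum()
    return out

x = np.ones((2, 4, 4))
kernels = np.ones((2, 3, 3))
y = depthwise_conv(x, kernels)        # each output pixel sums a 3x3 patch
```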
  • an instance normalization layer is added after the convolutional layers of the encoder and decoder to normalize the features, thereby improving the shadow removal effect.
  • the image pyramid algorithm can be used to first downsample the image to be processed, saving the gradient information of all pyramid levels during downsampling to form a Laplacian pyramid; the smallest layer of the pyramid is then fed into the trained neural network to obtain an output image; finally, the Laplacian pyramid is used to reconstruct the output image. Because the gradient information inside a shadow area is weak, even if the reconstruction restores some gradient information of the image to be processed, the shadow removal effect is not affected.
  • the image is reconstructed by using the gradient information of all levels of layers saved during downsampling, so as to eliminate shadows without affecting the image resolution.
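The downsample–process–reconstruct flow above can be sketched as follows. This is a minimal single-channel NumPy illustration: box-filter downsampling and nearest-neighbour upsampling stand in for a proper image pyramid, and an identity function stands in for the trained network.

```python
import numpy as np

def down(img):                        # 2x box-filter downsample (H, W even)
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img):                          # nearest-neighbour upsample
    return img.repeat(2, axis=0).repeat(2, axis=1)

def run_on_pyramid(img, process, levels=2):
    """Build a Laplacian pyramid, run `process` (e.g. the shadow-removal
    network) only on the smallest layer, then add the saved per-level
    gradient (detail) layers back while reconstructing to full resolution."""
    details = []
    for _ in range(levels):
        small = down(img)
        details.append(img - up(small))   # gradient information of this level
        img = small
    out = process(img)
    for d in reversed(details):
        out = up(out) + d
    return out

img = np.random.rand(8, 8)
# With an identity "network" the pyramid round-trip is exact:
restored = run_on_pyramid(img, lambda x: x)
```

The network only ever sees the smallest layer, which is what makes high-resolution images cheap to process.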
  • the image processing method also includes: S301: constructing an initial neural network;
  • S302: using sample data to train the initial neural network to obtain a trained neural network, wherein the sample data includes real-shot images and synthetic shadow images, and each synthetic shadow image is synthesized from a pure shadow map and a shadow-free map.
  • the sample data used to train the initial neural network plays a vital role in the whole image processing method, and there are mainly two methods for obtaining sample data: real scene acquisition and image synthesis.
  • during real-scene acquisition, the acquisition personnel select a light environment and subjects according to the scene category (for example, different lighting scenes: warm light, cold light, daylight, etc.), fix the mobile phone, camera, or other capture device on a tripod, and adjust a suitable light direction and focal length. Using palms, mobile phones, or other common objects as occluders, they cast a shadow on the subject and shoot to obtain a shadow image, then remove the occluder and shoot again to obtain a shadow-free background image, thereby obtaining paired sample data.
  • image synthesis methods can be used to generate realistic synthetic shadow maps for the training of neural networks.
  • the image synthesis method includes:
  • the data collector lays a piece of white paper on a desktop under a preset light environment and uses palms, mobile phones, or other common objects to block the light, leaving a pure shadow map S on the white paper, where all or part of the pure shadow map S is a shadow area;
  • the pure shadow map S may be transformed into S′, in which the pixel values of non-shadow areas are uniformly set to a fixed value a and the pixel values of shadow areas take values between 0 and a, where a is a positive integer;
  • the data collectors shoot shadow-free maps B of various objects in the same light environment;
  • the pure shadow map S (or the transformed pure shadow map S′) is multiplied pixel by pixel with the shadow-free map B to obtain a composite shadow map.
  • This image synthesis method takes into account the weakening effect of shadows on light, and can better handle shadows with gentle edge transitions, and has a strong sense of reality.
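A sketch of the pixel-wise compositing step. Normalizing the shadow map by the fixed value a (so that non-shadow pixels, which hold value a, leave the background unchanged while shadow pixels darken it proportionally) is an assumption of this illustration; the patent only specifies a pixel-by-pixel product.

```python
import numpy as np

A = 255  # assumed fixed value a for non-shadow pixels in the shadow map

def synthesize_shadow(shadow_map_s, background_b):
    """Pixel-wise product of the (transformed) pure shadow map S' and the
    shadow-free map B: non-shadow pixels (value A) leave B unchanged, while
    shadow pixels (values below A) darken it in proportion, modelling how
    a shadow attenuates light."""
    return background_b * (shadow_map_s.astype(np.float64) / A)

s = np.full((4, 4), A, dtype=np.uint8)
s[1:3, 1:3] = A // 2                     # a soft shadow patch
b = np.full((4, 4), 200.0)               # shadow-free background
composite = synthesize_shadow(s, b)
```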
  • the initial neural network also includes a module for classifying the sample data.
  • for real-shot sample data, the labeled data (Ground Truth, GT) is the shadow-removed image collected in the real scene. Since a real-shot image provides no ground truth for the shadow area mask map, only the parameters inside the second-level network are adjusted, based on the difference between the shadow-removed image output by the initial neural network and the shadow-removed image serving as the labeled data GT.
  • for synthetic sample data, the labeled data (Ground Truth, GT) includes the shadow-free map and the pure shadow map collected in the real scene: the parameters inside the first-level network 20 are adjusted according to the difference between the shadow area mask map and the pure shadow map, and the parameters inside the second-level network 22 are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-free map serving as labeled data.
  • the sample data acquisition method may also include one or more processes such as random flipping, rotation, color temperature adjustment, channel exchange, and adding random noise to the acquired sample data, making the sample data richer and increasing the robustness of the network.
  • when performing supervised training on the initial neural network, the loss function includes at least one of the following: pixel loss, feature loss, structural similarity loss, and adversarial loss.
  • the pixel loss function measures the similarity of two images at the pixel level and mainly includes an image pixel value loss and a gradient loss. In this embodiment, it mainly refers to the weighted sum of the mean square error between the pixel values of the output image of the initial neural network and the label image, and the L1 norm error between the gradients of the two images.
  • the pixel loss supervises the training process from the pixel level, so that the pixel value of each pixel of the output image of the initial neural network and the label image is as close as possible.
  • a pixel truncation loss can be introduced to truncate the pixel loss: when the absolute difference between two corresponding pixels is greater than a given threshold, the loss of the two pixels is calculated; otherwise the difference between the two pixels is ignored. Adding the pixel truncation loss guides the network to focus on the shadow area and suppresses image noise: not only is the shadow removal effect enhanced, the convergence of the network is also greatly accelerated.
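The truncation rule above can be written as a few lines of NumPy. The threshold value here is a made-up placeholder; the patent does not specify one.

```python
import numpy as np

def truncated_pixel_loss(pred, label, threshold=0.02):
    """L1 pixel loss with truncation: absolute differences at or below
    `threshold` are ignored, so small global noise contributes nothing
    and the loss concentrates on the (large-difference) shadow area."""
    diff = np.abs(pred - label)
    diff[diff <= threshold] = 0.0
    return diff.mean()

label = np.zeros((4, 4))
pred = label.copy()
pred[0, 0] = 0.5        # one clearly wrong pixel: contributes to the loss
pred[1, 1] = 0.01       # noise below the threshold: ignored
loss = truncated_pixel_loss(pred, label)
```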
  • the feature loss mainly refers to the weighted sum of the L1 norm errors between corresponding features of the output image of the initial neural network and the label image.
  • the VGG19 network pre-trained on the ImageNet dataset is used as a feature extractor: the output image of the initial neural network and the label image are each fed into the feature extractor to obtain the features of each layer of VGG19, after which the L1 norm errors between the corresponding features of the two images are calculated and summed with weights.
  • the features of each layer of VGG19 are insensitive to image details and noise and have good semantic characteristics. Therefore, even if the output image has defects such as noise or misalignment, the feature loss can still accurately produce an effective difference signal in shadow areas. It compensates for the pixel loss's sensitivity to noise and has good stability.
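The weighted per-layer comparison can be sketched generically. The toy "feature layers" below (a local mean and a horizontal gradient) are hypothetical stand-ins for the layers of a pretrained VGG19; only the loss structure is taken from the text.

```python
import numpy as np

def feature_loss(output, label, extractors, weights):
    """Weighted sum of L1 errors between corresponding feature maps of the
    network output and the label image. `extractors` stands in for the
    layers of a pretrained feature network such as VGG19."""
    return sum(w * np.abs(f(output) - f(label)).mean()
               for f, w in zip(extractors, weights))

# Toy "feature layers" (hypothetical): row means and horizontal gradients.
layers = [lambda im: im.reshape(4, -1).mean(axis=1),
          lambda im: np.diff(im, axis=1)]
out_img = np.zeros((4, 4))
lbl_img = np.zeros((4, 4))
loss = feature_loss(out_img, lbl_img, layers, weights=[1.0, 0.5])
```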
  • the structural similarity loss function is a function to measure the similarity of two images according to the global features of the images. In this embodiment, it mainly refers to the global difference in brightness and contrast between the output image of the initial neural network and the label image. Adding this loss function can effectively suppress the color cast of the network output and improve the overall quality of the image.
  • Adversarial loss mainly refers to the output of the discriminator and the loss value of the true category of the output image.
  • in the later stage of training, the effects of the pixel loss, feature loss, and structural similarity loss gradually become smaller, and network convergence slows down.
  • therefore, a discriminator network is trained synchronously to assist the training of the main network.
  • the output image of the initial neural network and the label image are fed to the discriminator, which judges whether the output image is the label image; a loss is calculated and the discriminator parameters are updated according to the discriminator's output and the true category of the output image. The discriminator's judgment on the output image is then taken as the loss on the authenticity of the output image, and the parameters of the initial neural network are updated with this loss.
  • Training ends when the discriminator cannot distinguish between the output image of the initial neural network and the label image.
  • the adversarial loss can effectively eliminate image side effects caused by network processing (for example, color inconsistency between shadow and non-shadow areas, residual shadows, etc.) and improve the realism of the network output image.
  • Threshold truncation loss: due to lighting, paired data collected in the real scene may exhibit slight brightness differences and color changes even in non-shadow areas; these differences are acceptable to users and need not be corrected. Therefore, to prevent the network's attention from focusing on these small global differences during training, the method introduces a threshold truncation loss: only where the difference between the network output and the GT is greater than a given threshold is the difference included in the overall loss used to compute parameter gradients; otherwise the loss is taken to be 0. This loss function tolerates slight differences between the network output and the GT and shifts the focus of learning to regions with large differences, effectively improving the network's ability to eliminate obvious shadows.
  • Shadow edge loss: first, dilate the shadow area mask map to obtain a dilated map; second, erode the shadow area mask map to obtain an eroded map; then take the difference between the dilated map and the eroded map as the boundary region between shadow and non-shadow areas and smooth it with TVLoss, which produces an effective transition between shadow and non-shadow areas.
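The dilate/erode/difference/TV sequence can be sketched as follows. This is a minimal NumPy illustration with a fixed 3x3 structuring element (the patent does not specify kernel sizes), applying a total-variation penalty only inside the boundary band.

```python
import numpy as np

def dilate(mask):      # 3x3 binary dilation (padded max over the 9 shifts)
    h, w = mask.shape
    p = np.pad(mask, 1)
    return np.max([p[i:i + h, j:j + w] for i in range(3) for j in range(3)], axis=0)

def erode(mask):       # 3x3 binary erosion (padded min over the 9 shifts)
    h, w = mask.shape
    p = np.pad(mask, 1, constant_values=1)
    return np.min([p[i:i + h, j:j + w] for i in range(3) for j in range(3)], axis=0)

def shadow_edge_loss(output, mask):
    """Total-variation penalty restricted to the band between the dilated
    and eroded shadow masks, i.e. the shadow/non-shadow boundary."""
    band = dilate(mask) - erode(mask)
    tv_v = np.abs(np.diff(output, axis=0))[band[1:, :] > 0].sum()
    tv_h = np.abs(np.diff(output, axis=1))[band[:, 1:] > 0].sum()
    return tv_v + tv_h

mask = np.zeros((6, 6)); mask[2:4, 2:4] = 1.0
flat = np.full((6, 6), 0.7)              # a perfectly smooth output
loss = shadow_edge_loss(flat, mask)      # zero: no variation at the boundary
```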
  • Shadow brightness loss: constrain the brightness of the region corresponding to the shadow area in the shadow-removed map output by the neural network to exceed the brightness of the shadow area in the input image to be processed (brightness difference greater than 0), which raises the brightness of the shadow region in the shadow-removed image.
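One natural reading of this constraint is a hinge on the brightness change inside the shadow region; the hinge form below is an assumption of this sketch, since the patent only states that the brightness difference must be greater than 0.

```python
import numpy as np

def shadow_brightness_loss(output, inp, mask):
    """Hinge on the per-pixel brightness change inside the shadow region:
    zero where the output is brighter than the input, positive where the
    network leaves the pixel as dark or darker, pushing it to lift shadows."""
    delta = (output - inp)[mask > 0]
    return np.maximum(-delta, 0.0).mean()

inp = np.full((4, 4), 0.3)
mask = np.zeros((4, 4)); mask[1:3, 1:3] = 1.0
good = inp + 0.2 * mask                  # shadow region brightened: loss 0
bad = inp - 0.1 * mask                   # shadow region darkened: positive loss
```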
  • for the background-layer (result map) output module of the initial neural network, the weighted sum of all the above losses is used as the total loss, and the Wasserstein generative adversarial network is used for the adversarial loss.
  • the network structure extracts the global features and local features of the input image, improves the degree of shadow elimination, and protects non-shadow areas from side effects.
  • Fig. 5(a) and Fig. 5(b) compare the processing effects achieved by the image processing method of the embodiment of the present application: Fig. 5(a) is an image to be processed containing shadows, and Fig. 5(b) is the shadow-removed image produced by the image processing method, as the comparison of the two images shows.
  • the image processing method provided by this application can effectively eliminate the shadow without causing significant side effects on the background layer.
  • the neural network structure and loss functions used in the embodiments of the present application can also be applied in scenarios such as shadow removal and rain/fog removal. They are mainly used to process high-resolution images captured by mobile terminals such as mobile phones, but can also handle images of various resolutions on PCs or other embedded devices.
  • an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured, by executing the executable instructions, to perform any one of the above image processing methods.
  • the storage medium includes a stored program, wherein when the program is running, the device where the storage medium is located is controlled to execute any one of the above image processing methods.
  • an image processing device is also provided.
  • Fig. 6 it is a structural block diagram of an optional image processing device according to an embodiment of the present application.
  • the image processing device 60 includes an image acquisition unit 600 and a processing unit 602 .
  • the image acquisition unit 600 is configured to acquire the image to be processed including the shaded area.
  • the processing unit 602 is configured to receive the image to be processed and process it with a trained neural network to obtain a shadow-removed image, wherein the neural network includes a two-stage cascade of a first-level network and a second-level network, and the image to be processed and the output of the first-level network are input to the second-level network at the same time.
  • the structure of the neural network is shown in FIG. 2 and related descriptions herein, and will not be repeated here.
  • the disclosed technical content can be realized in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units may be a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or some of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, read-only memory (ROM), random access memory (RAM), removable hard disks, magnetic disks, optical discs, and other media that can store program code.

Abstract

Disclosed in the present application are an image processing method and apparatus, a storage medium, and an electronic device. The image processing method comprises: acquiring an image to be processed, which includes a shadow area; and inputting said image into a trained neural network, so as to obtain a shadow-removed image, wherein the neural network includes a first-stage network and a second-stage network cascaded in two stages, the first-stage network receives the image to be processed and outputs a shadow-area mask map, and the second-stage network simultaneously receives the image to be processed and the shadow-area mask map and outputs the shadow-removed image. The present application solves the technical problems in the prior art that removing a shadow area easily produces side effects on the image background layer and imposes high requirements on the hardware platform.

Description

Image Processing Method and Apparatus, Storage Medium, and Electronic Device
This application claims priority to Chinese Patent Application No. 202111210502.3, filed on October 18, 2021, the entire content of which is incorporated herein by reference as part of this application.
Technical Field
The present application relates to image processing technology, and in particular to an image processing method, an apparatus, a storage medium, and an electronic device.
Background Art
When people photograph documents with a mobile phone, shadows are often cast on the document by the hand, the phone itself, or other objects in the environment that block the light, degrading the visual quality of the captured image. Processing the captured image with computer vision techniques to eliminate the shadows and restore the text and graphics behind them can effectively improve image quality. Document shadow removal is therefore a technology of practical importance with broad market prospects.
Effectively removing the shadow layer without producing significant side effects on the background layer, while running fast on acceptable hardware, is the basic requirement and main challenge for applying shadow removal on mobile phones. Current shadow removal methods either fail to remove the shadow completely, lose information from the background layer, or run slowly, none of which is acceptable for ordinary users.
One existing shadow removal method uses a neural network consisting of three modules: a global localization module, an appearance modeling module, and a semantic modeling module. The global localization module detects the shadow area and extracts its location features; the appearance modeling module learns the characteristics of the non-shadow area so that the network output stays consistent with the annotated data (Ground Truth, GT) in non-shadow regions; and the semantic modeling module restores the original content behind the shadow. However, this method does not directly output the shadow-free background image but the ratio of the shadow image to the background image, so the shadow image must be further divided by the network output pixel by pixel to obtain the background image. This introduces extra computation, and the division may suffer from divide-by-zero issues that affect numerical stability.
Therefore, an image processing technique is needed that can effectively remove shadows without producing significant side effects on the background layer, while running fast and keeping hardware requirements acceptable.
Summary of the Invention
Embodiments of the present application provide an image processing method, an apparatus, a storage medium, and an electronic device, to at least solve the technical problems in the prior art that removing a shadow area easily produces side effects on the image background layer and imposes high requirements on the hardware platform.
According to one aspect of the embodiments of the present application, an image processing method is provided, including: acquiring an image to be processed that contains a shadow area; and inputting the image to be processed into a trained neural network to obtain a shadow-removed image; wherein the neural network includes a two-stage cascade of a first-level network and a second-level network, the first-level network receives the image to be processed and outputs a shadow-area mask map, and the second-level network simultaneously receives the image to be processed and the shadow-area mask map and outputs the shadow-removed image.
Optionally, the first-level network includes: a first feature extraction module containing a first encoder, configured to extract features of the image to be processed layer by layer to obtain a first set of feature data; and a shadow area estimation module connected to the output of the first feature extraction module, containing a first decoder, configured to estimate the shadow area based on the first set of feature data and output a shadow-area mask map.
Optionally, the second-level network includes: a second feature extraction module containing a second encoder, connected to the output of the first-level network, which receives the shadow-area mask map output by the first-level network together with the image to be processed and is configured to obtain a second set of feature data; and a result map output module connected to the output of the second feature extraction module, containing a second decoder, configured to output the shadow-removed image based on the second set of feature data.
Optionally, the output of each layer of the first or second decoder is concatenated along the channel axis, via cross-layer connections, with the output of the corresponding layer of the first or second encoder, and a multi-scale pyramid pooling module is added on these cross-layer connections, which fuses features of different scales.
Optionally, after acquiring the image to be processed containing the shadow area, the image processing method further includes: downsampling the image to be processed with an image pyramid algorithm while saving the gradient information of each layer to form a Laplacian pyramid; feeding the smallest layer into the trained neural network to obtain an output image; and reconstructing the output image from low resolution to high resolution using the Laplacian pyramid to obtain the shadow-removed image.
Optionally, the image processing method further includes: constructing an initial neural network; and training the initial neural network with sample data to obtain the trained neural network, wherein the sample data includes real captured images and synthetic shadow images, and a synthetic shadow image is composed from a pure shadow image and a shadow-free image using an image synthesis method.
Optionally, synthesizing the synthetic shadow image from a pure shadow image and a shadow-free image using the image synthesis method includes: acquiring a pure shadow image; acquiring a shadow-free image; and obtaining the synthetic shadow image based on the pure shadow image and the shadow-free image.
Optionally, synthesizing the synthetic shadow image further includes: transforming the pure shadow image and obtaining the synthetic shadow image based on the transformed pure shadow image and the shadow-free image, wherein the pixel values of the non-shadow areas in the transformed pure shadow image are uniformly set to a fixed value a, and the pixel values of the shadow areas are values between 0 and a, a being a positive integer.
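As a concrete illustration of this composition step, the sketch below assumes a simple multiplicative model — synthetic image = background × S / a, with a = 255 for 8-bit images. This is one plausible reading of the transformation described above; the source does not fix the exact blending formula, and the helper name `synthesize_shadow` is hypothetical.

```python
def synthesize_shadow(background, shadow, a=255):
    """Blend a shadow-free background B with a transformed pure shadow map S.

    S equals a in non-shadow areas (no attenuation) and lies in (0, a)
    inside the shadow; the multiplicative model I = B * S / a used here is
    an assumption for illustration.
    """
    h, w = len(background), len(background[0])
    return [[background[y][x] * shadow[y][x] // a for x in range(w)]
            for y in range(h)]

# Toy 2x2 example: the right column is fully lit (S = 255), the left
# column is attenuated to roughly half brightness (S = 128).
B = [[200, 200],
     [100, 100]]
S = [[128, 255],
     [128, 255]]
I = synthesize_shadow(B, S)
```

With this model, pixels where S = a are left untouched and pixels where S < a are darkened in proportion, which matches the physical behaviour of a shadow attenuating the incident light.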
Optionally, the initial neural network further includes a module that determines the category of the sample data. When the sample data input to the initial neural network is determined to be a real captured image, the annotation data is a shadow-removed image captured in a real scene, and the parameters inside the second-level network are adjusted according to the difference between the shadow-removed image output by the initial neural network and the annotated shadow-removed image. When the sample data is determined to be a synthetic shadow image, the annotation data includes a shadow-free image captured in a real scene and a pure shadow image; the parameters inside the first-level network are adjusted according to the difference between the shadow-area mask map and the pure shadow image, and the parameters inside the second-level network are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-free image.
Optionally, when training the initial neural network with the sample data, the loss function includes at least one of: pixel loss, feature loss, structural similarity loss, adversarial loss, shadow edge loss, and shadow brightness loss.
Optionally, the pixel loss includes a pixel truncation loss: when the absolute difference between two corresponding pixels in the output image of the initial neural network and the label image is greater than a given threshold, the loss of the two pixels is computed; when the absolute difference is not greater than the given threshold, the difference between the two pixels is ignored.
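A minimal sketch of this truncation rule, assuming a plain L1 base loss and an illustrative threshold (the source specifies neither):

```python
def truncated_pixel_loss(output, target, threshold=2):
    """Mean absolute pixel difference, where pairs within `threshold` of
    each other contribute zero, mirroring the truncation rule above.
    The threshold value and the L1 base loss are illustrative assumptions."""
    total = 0
    for o, t in zip(output, target):
        diff = abs(o - t)
        total += diff if diff > threshold else 0
    return total / len(output)

# Flattened toy "images": only the pixel pairs differing by more than the
# threshold (diffs of 3 and 10) contribute to the loss.
loss = truncated_pixel_loss([10, 11, 20], [10, 14, 10])
```

Ignoring sub-threshold differences keeps the network from being penalized for imperceptible deviations from the label image.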
Optionally, the shadow brightness loss makes the brightness difference between the region corresponding to the shadow area in the shadow-removed image output by the neural network and the shadow area in the input image to be processed greater than 0, and is used to raise the brightness of the region corresponding to the shadow area in the shadow-removed image.
Optionally, when the loss function includes the shadow edge loss, the image processing method includes: dilating the shadow-area mask map to obtain a dilated map; eroding the shadow-area mask map to obtain an eroded map; and taking the difference set of the dilated map and the eroded map as the boundary region between shadow and non-shadow, which is smoothed using TVLoss.
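The dilate/erode/difference construction can be sketched on a binary mask as follows; the 3×3 structuring element is an illustrative assumption, and the TVLoss smoothing applied over the resulting band is omitted.

```python
def morph3x3(mask, dilate=True):
    """3x3 binary dilation (dilate=True) or erosion (dilate=False),
    with the window clipped at the image border."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[yy][xx]
                      for yy in range(max(0, y - 1), min(h, y + 2))
                      for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(any(window)) if dilate else int(all(window))
    return out

def shadow_edge_band(mask):
    """Difference set of the dilated and eroded masks: the band straddling
    the shadow/non-shadow boundary, over which TVLoss would be applied."""
    dil = morph3x3(mask, dilate=True)
    ero = morph3x3(mask, dilate=False)
    return [[d & (1 - e) for d, e in zip(dr, er)] for dr, er in zip(dil, ero)]

# 5x5 mask with a 3x3 shadow region in the middle.
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
band = shadow_edge_band(mask)
```

For this toy mask the band covers everything except the innermost shadow pixel, i.e. a one-pixel-wide ring on each side of the boundary.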
According to another aspect of the embodiments of the present application, an image processing apparatus is further provided, including: an image acquisition unit, configured to acquire an image to be processed containing a shadow area; and a processing unit, configured to receive the image to be processed and process it with a trained neural network to obtain a shadow-removed image; wherein the neural network includes a two-stage cascade of a first-level network and a second-level network, the first-level network receives the image to be processed and outputs a shadow-area mask map, and the second-level network simultaneously receives the image to be processed and the shadow-area mask map and outputs the shadow-removed image.
Optionally, the first-level network includes: a first feature extraction module containing a first encoder, configured to extract features of the image to be processed layer by layer to obtain a first set of feature data; and a shadow area estimation module connected to the output of the first feature extraction module, containing a first decoder, configured to estimate the shadow area based on the first set of feature data and output a shadow-area mask map.
Optionally, the second-level network includes: a second feature extraction module containing a second encoder, connected to the output of the first-level network, which receives the shadow-area mask map output by the first-level network together with the image to be processed and is configured to obtain a second set of feature data; and a result map output module connected to the output of the second feature extraction module, containing a second decoder, configured to output the shadow-removed image based on the second set of feature data.
According to another aspect of the embodiments of the present application, a storage medium is further provided, including a stored program, wherein when the program runs, the device on which the storage medium resides is controlled to execute any one of the image processing methods described above.
According to another aspect of the embodiments of the present application, an electronic device is further provided, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute any one of the image processing methods described above via execution of the executable instructions.
The present application proposes a fast and effective shadow removal method suitable for mobile terminals such as mobile phones. It exploits the physical characteristics of shadows to synthesize highly realistic training material, and combines a variety of loss functions with an effective network structure and modules during training to achieve good shadow removal. To cope with the high resolution of images captured by mobile phones and other mobile terminals, the application adopts downsampling and network pruning techniques, so that a fast processing speed can still be achieved on high-resolution images.
Brief Description of the Drawings
The drawings described here are provided for a further understanding of the present application and constitute a part of it; the schematic embodiments of the present application and their descriptions are used to explain the application and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a flow chart of an optional image processing method according to an embodiment of the present application;
Fig. 2 is a structural diagram of an optional neural network according to an embodiment of the present application;
Fig. 3 is a flow chart of an optional method for training a neural network according to an embodiment of the present application;
Fig. 4 is a flow chart of an optional image synthesis method according to an embodiment of the present application;
Fig. 5(a) and Fig. 5(b) are comparison diagrams of the shadow removal effect achieved by the image processing method of an embodiment of the present application;
Fig. 6 is a structural block diagram of an optional image processing apparatus according to an embodiment of the present application.
Detailed Description of Embodiments
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first", "second", etc. in the description, claims, and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the items so used are interchangeable where appropriate, so that the embodiments described herein can be practiced in orders other than those illustrated or described. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to the process, method, product, or device.
An optional image processing method according to an embodiment of the present application is described below. It should be noted that the steps shown in the flow charts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flow charts, in some cases the steps shown or described may be executed in an order different from that shown here.
Referring to Fig. 1, which is a flow chart of an optional image processing method according to an embodiment of the present application. As shown in Fig. 1, the image processing method includes the following steps:
S100: acquiring an image to be processed that contains a shadow area;
S102: inputting the image to be processed into a trained neural network to obtain a shadow-removed image; wherein the neural network includes a two-stage cascade of a first-level network and a second-level network, the first-level network receives the image to be processed and outputs a shadow-area mask map, and the second-level network simultaneously receives the image to be processed and the shadow-area mask map and outputs the shadow-removed image.
With the above image processing method, an accurate shadow area boundary can be obtained, and the resulting shadow-removed image transitions smoothly between shadow and non-shadow.
In an optional embodiment, as shown in Fig. 2, the neural network includes a two-stage cascade of a first-level network 20 and a second-level network 22. The first-level network includes a first feature extraction module 200 and a shadow area estimation module 202; the second-level network includes a second feature extraction module 204 and a result map output module 206. The first feature extraction module 200 contains a first encoder and extracts features of the image to be processed layer by layer to obtain a first set of feature data. The shadow area estimation module 202, connected to the output of the first feature extraction module 200, contains a first decoder and estimates the shadow area based on the first set of feature data, outputting a shadow-area mask map. The second feature extraction module 204 contains a second encoder connected to the output of the first-level network; it receives the shadow-area mask map output by the first-level network together with the image to be processed and obtains a second set of feature data. The result map output module 206, connected to the output of the second feature extraction module 204, contains a second decoder and outputs the shadow-removed image based on the second set of feature data. The two-stage cascaded neural network enhances the shadow removal effect. In an optional embodiment, the first-level network and the second-level network have the same structure except for the number of input channels; for example, both can be constructed based on the classic segmentation network UNet.
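The data flow of the cascade described above can be sketched at the level of tensor shapes. The real networks are UNet-style encoder/decoder stacks, so only channel counts are modelled here; the function names and the 256×256 size are illustrative assumptions.

```python
def first_level(image_shape):
    """First-level network: RGB image in, single-channel shadow mask out."""
    channels, height, width = image_shape
    assert channels == 3, "expects an RGB image"
    return (1, height, width)            # shadow-area mask map

def second_level(image_shape, mask_shape):
    """Second-level network: the image and the mask are concatenated along
    the channel axis, so its encoder sees 3 + 1 = 4 input channels."""
    c_img, height, width = image_shape
    c_mask, _, _ = mask_shape
    fused_input = (c_img + c_mask, height, width)
    return fused_input, (3, height, width)   # de-shadowed RGB output

image = (3, 256, 256)                    # illustrative input size
mask = first_level(image)
fused, deshadowed = second_level(image, mask)
```

The only structural difference between the two stages is this input channel count, which is why the same UNet-style backbone can serve both.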
The output of each layer of the two encoders is concatenated along the channel axis, via cross-layer connections, with the output of the corresponding layer of the two decoders. A multi-scale pyramid pooling module is added on the cross-layer connections between encoder and decoder. The multi-scale pyramid pooling module includes several pooling layers with different kernel sizes, convolutional layers, and interpolation upsampling layers: features at different scales are first extracted by the pooling layers, low-level and/or high-level features are then extracted by the convolutional layers, the interpolation upsampling layers resize the outputs to the same size as the corresponding encoder/decoder layer, and everything is finally concatenated into one feature along the channel axis. Because the extent and severity of shadows vary greatly between images, determining the shadow area requires both local texture features and global semantic information. The multi-scale pyramid pooling module fuses features of different scales, which enhances the generalization of the network so that it performs well on shadow images of different areas and degrees.
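A dependency-free sketch of the pool/upsample/concatenate path of such a module is given below; the 1×1 convolutions of a real pyramid pooling module are omitted, and the scale set (1, 2, 4) is an illustrative assumption.

```python
def avg_pool(feat, k):
    """Non-overlapping k x k average pooling of a 2-D feature map."""
    h, w = len(feat), len(feat[0])
    return [[sum(feat[y * k + dy][x * k + dx]
                 for dy in range(k) for dx in range(k)) / (k * k)
             for x in range(w // k)] for y in range(h // k)]

def upsample_nearest(feat, k):
    """Nearest-neighbour upsampling by factor k."""
    return [[v for v in row for _ in range(k)] for row in feat for _ in range(k)]

def pyramid_pool(feat, scales=(1, 2, 4)):
    """Pool at several scales, resize back to the input size, and stack the
    results along the channel axis (returned here as a list of maps)."""
    return [upsample_nearest(avg_pool(feat, k), k) for k in scales]

# 4x4 toy feature map with four constant 2x2 blocks.
feat = [[1, 1, 3, 3],
        [1, 1, 3, 3],
        [5, 5, 7, 7],
        [5, 5, 7, 7]]
channels = pyramid_pool(feat)
```

The finest scale preserves local texture, while the coarsest scale summarizes the whole map into a single global statistic, illustrating how local and global context are fused.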
To increase the running speed of the model on the device, the model can be pruned by replacing the convolutional layers in the encoder with grouped convolutions, where each convolution kernel convolves only one channel, thereby reducing the amount of computation and increasing the processing speed.
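The saving from this pruning can be quantified with a quick multiply-accumulate (MAC) count; the layer dimensions below are illustrative and not taken from the patent.

```python
def conv_macs(h, w, c_in, c_out, k, groups=1):
    """Multiply-accumulate count of a k x k convolution layer with `groups`
    groups on an h x w output feature map."""
    return h * w * k * k * (c_in // groups) * c_out

# Illustrative layer: 64 -> 64 channels, 3x3 kernel, 128x128 feature map.
standard = conv_macs(128, 128, 64, 64, 3)                # ordinary convolution
depthwise = conv_macs(128, 128, 64, 64, 3, groups=64)    # one kernel per channel
savings = standard / depthwise                           # 64x fewer MACs
```

With one group per channel (a depthwise convolution), the MAC count drops by a factor equal to the number of channels, which is why this substitution is a common pruning step for mobile deployment.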
To better suppress covariate shift and enhance the network's ability to fit the data, instance normalization layers are added after the convolutional layers of the encoder and decoder to normalize the features, thereby improving the shadow removal effect.
When the image to be processed has a high resolution or a large amount of data, feeding it directly into the trained neural network may overflow the graphics memory or make the processing time so long that the user experience suffers. A conventional interpolation-based scaling algorithm could be used to address this, but it easily loses image information, so that the generated image cannot be perfectly enlarged back to the original.
Considering that shadow areas usually carry no significant gradient information, in an optional embodiment an image pyramid algorithm can be used to first downsample the image to be processed, saving the gradient information of each layer during downsampling to form a Laplacian pyramid. The layer with the smallest size is then fed into the trained neural network to obtain an output image. Finally, the Laplacian pyramid is used to reconstruct the output image; since the gradient information in the shadow area is weak, the reconstruction restores some gradient information of the image to be processed without affecting the shadow removal result. Image reconstruction using the gradient information saved during downsampling thus removes shadows without reducing the image resolution. By introducing downsampling and image reconstruction, the processing speed is guaranteed while the image quality before and after processing is preserved, which makes it practical to process high-resolution images on devices with limited computing power such as mobile phones.
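A 1-D sketch of this save-gradients-then-reconstruct idea follows. Real image pyramids use 2-D Gaussian filtering; pair averaging and nearest-neighbour upsampling are simplifications chosen for brevity, and the network step between decomposition and reconstruction is omitted.

```python
def downsample(sig):
    """Halve a 1-D signal by averaging neighbouring pairs."""
    return [(sig[i] + sig[i + 1]) / 2 for i in range(0, len(sig) - 1, 2)]

def upsample(sig):
    """Double a 1-D signal by duplicating samples (nearest neighbour)."""
    return [v for v in sig for _ in range(2)]

def build_laplacian(sig, levels):
    """Downsample repeatedly, keeping the per-level residual (the 'gradient
    information' saved at each pyramid layer)."""
    residuals = []
    for _ in range(levels):
        small = downsample(sig)
        residuals.append([a - b for a, b in zip(sig, upsample(small))])
        sig = small
    return sig, residuals

def reconstruct(sig, residuals):
    """Add the saved residuals back, coarse to fine."""
    for res in reversed(residuals):
        sig = [a + b for a, b in zip(upsample(sig), res)]
    return sig

signal = [10, 12, 30, 34, 50, 54, 70, 74]
base, residuals = build_laplacian(signal, 2)
restored = reconstruct(base, residuals)
```

The coarsest level (`base`) is what would be fed to the network; because shadow regions carry little gradient energy, the residuals added back during reconstruction restore detail without reintroducing the removed shadow.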
As shown in Fig. 3, to obtain the trained neural network, the image processing method further includes:
S300: constructing an initial neural network;
S302: training the initial neural network with sample data to obtain the trained neural network, wherein the sample data includes real captured images and synthetic shadow images, and a synthetic shadow image is composed from a pure shadow image and a shadow-free image.
由于用户常拍摄的图像中阴影种类非常丰富,从阴影的边缘来区分,包括光源离背景距离较近时拍摄出的清晰锐利的阴影边缘,以及光源离背景距离较远时拍摄出的模糊的、过渡平缓的阴影边缘;除此以外,当光源呈现不同的颜色时(例如偏红黄色的暖色光和偏蓝的冷色光和日光),阴影也会出现不同的颜色。因此,考虑到这些特点,用于训练初始神经网络的样本数据在整个图像处理方法中起着至关重要的作用,样本数据的获取主要有两种方法:实景采集和图像合成。Since there are many types of shadows in the images that users often take, they can be distinguished from the edges of the shadows, including clear and sharp shadow edges when the light source is close to the background, and blurred and sharp shadow edges when the light source is far away from the background. Shadow edges with smooth transitions; in addition, when the light source presents different colors (such as reddish yellow warm light and bluish cool light and sunlight), the shadow will also appear different colors. Therefore, considering these characteristics, the sample data used to train the initial neural network plays a vital role in the whole image processing method, and there are mainly two methods for obtaining sample data: real scene acquisition and image synthesis.
In the real-scene capture method, collectors select a corresponding lighting environment and subject according to the scene category (for example, different lighting scenes such as warm light, cool light, and daylight), fix a capture device such as a mobile phone or camera on a tripod, adjust a suitable lighting direction and focal length, and use a palm, a mobile phone, or another common object as an occluder to cast a shadow on the subject; a shadow image is captured, the occluder is then removed, and a shadow-free background image is captured, yielding paired sample data.
However, real-scene capture usually cannot guarantee high-quality sample data. On the one hand, because occlusion changes the lighting, the background image and the shadow image differ in brightness and color in non-shadow regions, and the shadow image is difficult to align perfectly with the background image; on the other hand, changes in lighting or focus introduce noise into the shadow image and the background image. All of these have a considerable impact on network training.
To address this, an image synthesis method can be used to generate realistic composite shadow images for training the neural network.
In an optional embodiment, the image synthesis method includes:
S400: obtaining a pure shadow image;
In an optional embodiment, under a preset lighting environment, a data collector lays a sheet of white paper flat on a desktop and uses a palm, a mobile phone, or another common object to block the light, leaving a pure shadow image S on the white paper, where all or part of the pure shadow image S is a shadow region;
Because the non-shadow region on the white paper may not appear pure white when the pure shadow image is captured, the boundary between the non-shadow region and the shadow region may not be distinct enough. Therefore, in another optional embodiment, the pure shadow image may also be transformed, for example as S'=min(a, S/mean(S)*a), where a is a positive integer. Through this transformation, the pixel values of the non-shadow region in the transformed pure shadow image are uniformly set to a fixed value a (for example, 255), while the pixel values of the shadow region fall between 0 and a, giving the pure shadow image a clearer boundary between the non-shadow region and the shadow region.
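The transform S'=min(a, S/mean(S)*a) can be written directly; a minimal sketch follows, where the value a=255 is only an example:

```python
import numpy as np

def normalize_pure_shadow(S, a=255):
    # S' = min(a, S / mean(S) * a): pixels at or above the image mean
    # (the near-white paper) clip to exactly a, while darker shadow
    # pixels land strictly between 0 and a.
    S = S.astype(np.float64)
    return np.minimum(float(a), S / S.mean() * a)
```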
S402: obtaining a shadow-free image;
In an optional embodiment, the data collector captures shadow-free images B of various subjects under the same lighting environment as above;
S404: obtaining a composite shadow image based on the pure shadow image and the shadow-free image;
In an optional embodiment, the pure shadow image S (or the transformed pure shadow image S') is multiplied pixel by pixel with the shadow-free image B to obtain the composite shadow image.
Because this image synthesis method accounts for the attenuating effect of shadows on light, it handles shadows with smoothly transitioning edges well and produces highly realistic results.
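A sketch of the S404 synthesis step. Dividing by a so that the transformed pure shadow image acts as an attenuation factor in [0, 1] is an assumption of this sketch; the text itself only specifies a pixel-wise multiplication:

```python
import numpy as np

def synthesize_shadow(B, S_prime, a=255):
    # Treat S'/a in [0, 1] as a per-pixel light attenuation factor:
    # non-shadow pixels (S' == a) leave the background B unchanged,
    # shadow pixels darken it in proportion to the shadow strength.
    return B.astype(np.float64) * (S_prime.astype(np.float64) / a)
```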
Because the sample data is a mixture of real-shot images and composite shadow images, the initial neural network further includes a module that determines the category of the sample data. When the sample data input to the initial neural network is determined to be a real-shot image, the annotation data (Ground Truth, GT) is the shadow-removed image captured in the real scene; since the shadow-region mask of a real-shot image cannot be adjusted, the parameters inside the second-level network 22 can be adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-removed image serving as the annotation data GT. When the sample data input to the initial neural network is determined to be a composite shadow image, the annotation data (Ground Truth, GT) includes the shadow-free image captured in the real scene and the pure shadow image; the parameters inside the first-level network 20 are adjusted according to the difference between the shadow-region mask map and the pure shadow image, and the parameters inside the second-level network 22 are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-free image serving as annotation data. Training with this mixed data as sample data makes it possible to obtain accurate masks even for shadows with smooth transitions, guaranteeing the quality of mask segmentation and improving the effect of shadow removal.
In an optional embodiment, the sample data acquisition method may further include applying one or more of random flipping, rotation, color-temperature adjustment, channel swapping, and adding random noise to the acquired sample data, enriching the sample data and increasing the robustness of the network.
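A sketch of the paired augmentations mentioned above (flip, 90-degree rotation, channel swap). The function name and its signature are illustrative; the key point is that the identical random transform is applied to both images of a pair so the shadow image stays aligned with its ground truth. Color-temperature adjustment and noise injection are omitted for brevity:

```python
import numpy as np

def augment_pair(shadow_img, gt_img, rng):
    # Same random flip / rotation / channel permutation for both images,
    # so the (shadow, ground-truth) pair stays pixel-aligned.
    if rng.random() < 0.5:
        shadow_img, gt_img = shadow_img[:, ::-1], gt_img[:, ::-1]
    k = int(rng.integers(0, 4))
    shadow_img, gt_img = np.rot90(shadow_img, k), np.rot90(gt_img, k)
    perm = rng.permutation(3)  # swap the RGB channels
    return shadow_img[..., perm], gt_img[..., perm]
```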
In an optional embodiment, when supervised training is performed on the initial neural network, the loss function includes at least one of the following: pixel loss, feature loss, structural similarity loss, and adversarial loss.
The pixel loss function measures the similarity of two images at the pixel level and mainly comprises an image pixel-value loss and a gradient loss. In this embodiment it mainly refers to the weighted sum of the mean squared error between the pixel values of the initial neural network's output image and the label image, and the L1-norm error between the gradients of the two images. The pixel loss supervises the training process at the pixel level, making each pixel value of the output image as close as possible to that of the label image. To guide the initial neural network to focus on the difference between the shadow layer and the background layer in the shadow region rather than on noise across the whole image, in an optional embodiment a pixel truncation loss can be introduced that truncates the pixel loss: the loss for two pixels is computed only when their absolute difference exceeds a given threshold, and the difference between the two pixels is otherwise ignored. Adding the pixel truncation loss guides the network to attend to the shadow region and suppresses image noise; not only is the shadow-removal effect enhanced, but the convergence of the network is also greatly accelerated.
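The truncated pixel loss described above can be sketched as follows. The MSE form and the threshold value are illustrative choices; the text does not fix either:

```python
import numpy as np

def truncated_pixel_loss(output, label, threshold=0.05):
    # Only pixel pairs whose absolute difference exceeds the threshold
    # contribute to the loss; smaller differences (mostly noise) are ignored.
    diff = output - label
    return np.mean(np.where(np.abs(diff) > threshold, diff ** 2, 0.0))
```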
The feature loss mainly refers to the weighted sum of the L1-norm errors between corresponding features of the initial neural network's output image and the label image. In an optional embodiment, a VGG19 network pre-trained on the ImageNet dataset is used as the feature extractor; the output image of the initial neural network and the label image are each fed into this extractor to obtain the features of every VGG19 layer, and the L1-norm errors between the corresponding features of the two images are computed and summed with weights. The features of each VGG19 layer are insensitive to image detail and noise and have good semantic properties; therefore, even if the input and output images suffer from defects such as noise or misalignment, the feature loss can still accurately capture the effective difference in the shadow region, compensating for the pixel loss's sensitivity to noise and offering good stability.
The structural similarity loss function measures the similarity of two images based on their global features. In this embodiment it mainly refers to the global differences in brightness and contrast between the output image of the initial neural network and the label image; adding this loss function effectively suppresses color cast in the network output and improves the overall quality of the image.
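The global brightness-and-contrast comparison can be illustrated with the luminance and contrast terms of SSIM. This is only an illustration of the idea: the structure term is dropped, the comparison is computed over the whole image rather than in local windows, and the constants follow common SSIM defaults for images in [0, 1]:

```python
import numpy as np

def brightness_contrast_similarity(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Luminance and contrast terms of SSIM over the whole image;
    # 1 minus this value would serve as the loss.
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    luminance = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    contrast = (2 * sx * sy + c2) / (sx ** 2 + sy ** 2 + c2)
    return luminance * contrast
```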
The adversarial loss mainly refers to the loss computed from the discriminator's output and the true category of the output image. Late in training, when the difference between the initial neural network's output image and the label image becomes small, the effects of the pixel loss, feature loss, and structural similarity loss gradually diminish and network convergence slows. At that point a discriminator network is trained in parallel to assist training. First, the output image of the initial neural network and the label image are fed into the discriminator, which judges whether the output image is a label image; a loss is computed from the discriminator's output and the true category of the output image, and the discriminator's parameters are updated. The discriminator's judgment of the output image is then used as a loss on the realism of the output image, and that loss is used to update the parameters of the initial neural network. Training ends when the discriminator can no longer distinguish the output image of the initial neural network from the label image. The adversarial loss effectively eliminates image side effects introduced by network processing (for example, color inconsistency between shadow and non-shadow regions, or residual shadows) and improves the realism of the network's output image.
Threshold truncation loss. Because of lighting effects, paired data captured in real scenes may also show slight brightness and color differences in non-shadow regions; these differences are acceptable to users and need no processing. During training, therefore, to prevent the network from focusing on these small global differences, the method introduces a threshold truncation loss: the difference between the network output and the GT contributes to the gradient of the overall loss with respect to the parameters only when it exceeds a given threshold, and the loss is otherwise taken to be 0. This loss function tolerates small differences between the network output and the GT and shifts the focus of learning to regions with larger differences, effectively improving the network's ability to remove more conspicuous shadows.
Shadow edge loss. First, a dilation operation is applied to the shadow-region mask map to obtain a dilated map; second, an erosion operation is applied to the shadow-region mask map to obtain an eroded map; then, the set difference between the dilated map and the eroded map is taken as the boundary region between shadow and non-shadow, which is smoothed with TVLoss, effectively producing a smooth transition between shadow and non-shadow regions.
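The dilation-minus-erosion boundary extraction can be sketched as follows. The 3x3 square structuring element is an assumption of this sketch, and the TVLoss smoothing applied to the resulting band is omitted:

```python
import numpy as np

def shadow_boundary(mask, r=1):
    # Morphological dilation and erosion with a (2r+1)x(2r+1) square
    # element; their difference is the band around the shadow edge.
    padded = np.pad(mask, r, mode="edge")
    h, w = mask.shape
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(2 * r + 1)
                        for dx in range(2 * r + 1)])
    dilated = windows.max(axis=0)
    eroded = windows.min(axis=0)
    return dilated - eroded  # nonzero only near the mask boundary
```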
Shadow brightness loss, which makes the difference between the brightness of the region corresponding to the shadow region in the shadow-removed image output by the neural network and the brightness of the shadow region in the input image to be processed greater than 0; it is used to raise the brightness of the region corresponding to the shadow region in the shadow-removed image.
In an optional embodiment, the background-layer output module of the initial neural network uses the weighted sum of all the above losses as the total loss, and a Wasserstein generative adversarial network is used for the adversarial loss.
This network structure extracts both the global and the local features of the input image, improving the degree of shadow removal while protecting non-shadow regions from side effects.
Figures 5(a) and 5(b) compare the processing results achieved with the image processing method of an embodiment of the present application, where Figure 5(a) is an image to be processed that contains a shadow and Figure 5(b) is the shadow-removed image after processing by the image processing method. The comparison of the two figures shows that the image processing method provided by this application can effectively remove shadows without causing significant side effects in the background layer.
The neural network structure and loss functions used in the embodiments of the present application can also be applied in scenarios such as shadow removal, deraining, and dehazing. They are mainly intended for processing high-resolution images captured by mobile terminals such as mobile phones, but are equally applicable to processing images of various resolutions on PCs or other embedded devices.
According to another aspect of the embodiments of the present application, an electronic device is also provided, including: a processor; and a memory for storing executable instructions of the processor, where the processor is configured to execute any one of the above image processing methods by executing the executable instructions.
According to another aspect of the embodiments of the present application, a storage medium is also provided. The storage medium includes a stored program, and when the program runs, the device on which the storage medium resides is controlled to execute any one of the above image processing methods.
According to another aspect of the embodiments of the present application, an image processing apparatus is also provided. Figure 6 is a structural block diagram of an optional image processing apparatus according to an embodiment of the present application. As shown in Figure 6, the image processing apparatus 60 includes an image acquisition unit 600 and a processing unit 602.
The units included in the image processing apparatus 60 are described in detail below.
The image acquisition unit 600 is configured to acquire an image to be processed that contains a shadow region.
The processing unit 602 is configured to receive the image to be processed and to process it with a trained neural network to obtain a shadow-removed image, where the neural network includes a two-stage cascade of a first-level network and a second-level network, and the image to be processed and the output image of the first-level network are input to the second-level network simultaneously.
In an optional embodiment, the structure of the neural network is as shown in Figure 2 and the related description herein, and is not repeated here.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis; for parts not described in detail in a given embodiment, reference may be made to the relevant descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present application. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements shall also fall within the protection scope of the present application.

Claims (18)

  1. An image processing method, comprising:
    acquiring an image to be processed that contains a shadow region;
    inputting the image to be processed into a trained neural network to obtain a shadow-removed image, wherein the neural network comprises a two-stage cascade of a first-level network and a second-level network, the first-level network receives the image to be processed and outputs a shadow-region mask map, and the second-level network simultaneously receives the image to be processed and the shadow-region mask map and outputs the shadow-removed image.
  2. The image processing method according to claim 1, wherein the first-level network comprises:
    a first feature extraction module, comprising a first encoder, configured to extract features of the image to be processed layer by layer to obtain a first set of feature data;
    a shadow region estimation module, connected to the output of the first feature extraction module and comprising a first decoder, configured to estimate a shadow region based on the first set of feature data and output a shadow-region mask map.
  3. The image processing method according to claim 1, wherein the second-level network comprises:
    a second feature extraction module, comprising a second encoder, connected to the output of the first-level network, which receives the shadow-region mask map output by the first-level network while receiving the image to be processed, and is configured to obtain a second set of feature data;
    a result image output module, connected to the output of the second feature extraction module and comprising a second decoder, configured to output the shadow-removed image based on the second set of feature data.
  4. The image processing method according to claim 2 or 3, wherein the output of each layer of the first decoder or the second decoder is concatenated along the channel axis, via cross-layer connections, with the output of the corresponding layer of the first encoder or the second encoder, and a multi-scale pyramid pooling module is added on the cross-layer connections between the first decoder or the second decoder and the first encoder or the second encoder, the multi-scale pyramid pooling module fusing features of different scales.
  5. The image processing method according to claim 1, wherein after acquiring the image to be processed that contains the shadow region, the image processing method further comprises:
    downsampling the image to be processed using an image pyramid algorithm, and saving the gradient information of each pyramid level during downsampling to form a Laplacian pyramid;
    feeding the layer with the smallest size into the trained neural network to obtain an output image;
    reconstructing the output image from low resolution to high resolution using the Laplacian pyramid to obtain the shadow-removed image.
  6. The image processing method according to claim 1, further comprising:
    constructing an initial neural network;
    training the initial neural network with sample data to obtain the trained neural network, wherein the sample data comprises real-shot images and a composite shadow image, and the composite shadow image is synthesized from a pure shadow image and a shadow-free image using an image synthesis method.
  7. The image processing method according to claim 1, wherein synthesizing the composite shadow image from the pure shadow image and the shadow-free image using the image synthesis method comprises:
    obtaining a pure shadow image;
    obtaining a shadow-free image;
    obtaining the composite shadow image based on the pure shadow image and the shadow-free image.
  8. The image processing method according to claim 7, wherein synthesizing the composite shadow image from the pure shadow image and the shadow-free image using the image synthesis method further comprises: transforming the pure shadow image, and obtaining the composite shadow image based on the transformed pure shadow image and the shadow-free image, wherein the pixel values of the non-shadow region in the transformed pure shadow image are uniformly set to a fixed value a, the pixel values of the shadow region are values between 0 and a, and a is a positive integer.
  9. The image processing method according to claim 7, wherein the initial neural network further comprises a module that determines the category of the sample data; when the sample data input to the initial neural network is determined to be a real-shot image, the annotation data is a shadow-removed image captured in a real scene, and parameters inside the second-level network are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-removed image serving as the annotation data; when the sample data input to the initial neural network is determined to be a composite shadow image, the annotation data comprises the shadow-free image captured in the real scene and the pure shadow image, parameters inside the first-level network are adjusted according to the difference between the shadow-region mask map and the pure shadow image, and parameters inside the second-level network are adjusted according to the difference between the shadow-removed image output by the initial neural network and the shadow-free image.
  10. The image processing method according to claim 6, wherein when the initial neural network is trained with the sample data, the loss function comprises at least one of the following: pixel loss, feature loss, structural similarity loss, adversarial loss, shadow edge loss, and shadow brightness loss.
  11. The image processing method according to claim 10, wherein the pixel loss comprises a pixel truncation loss: when the absolute difference between two corresponding pixels in the output image of the initial neural network and the label image is greater than a given threshold, the loss for the two pixels is computed; when the absolute difference between the two corresponding pixels in the output image of the initial neural network and the label image is not greater than the given threshold, the difference between the two pixels is ignored.
  12. The image processing method according to claim 10, wherein the shadow brightness loss makes the difference between the brightness of the region corresponding to the shadow region in the shadow-removed image output by the neural network and the brightness of the shadow region in the input image to be processed greater than 0, and is used to raise the brightness of the region corresponding to the shadow region in the shadow-removed image.
  13. The image processing method according to claim 10, wherein when the loss function comprises the shadow edge loss, the image processing method comprises: performing dilation on the shadow-region mask map to obtain a dilated map; performing erosion on the shadow-region mask map to obtain an eroded map; and taking the set difference between the dilated map and the eroded map as the boundary region between shadow and non-shadow, which is smoothed using TVLoss.
  14. An image processing apparatus, comprising:
    an image acquisition unit, configured to acquire an image to be processed that contains a shadow region;
    a processing unit, configured to receive the image to be processed and process it with a trained neural network to obtain a shadow-removed image, wherein the neural network comprises a two-stage cascade of a first-level network and a second-level network, the first-level network receives the image to be processed and outputs a shadow-region mask map, and the second-level network simultaneously receives the image to be processed and the shadow-region mask map and outputs the shadow-removed image.
  15. The image processing apparatus according to claim 14, wherein the first-level network comprises:
    a first feature extraction module, comprising a first encoder, configured to extract features of the image to be processed layer by layer to obtain a first set of feature data;
    a shadow region estimation module, connected to the output of the first feature extraction module and comprising a first decoder, configured to estimate the shadow region based on the first set of feature data and output the shadow area mask map.
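The layer-by-layer feature extraction performed by the encoders in claims 15 and 16 can be illustrated as a simple feature pyramid, using 2×2 average pooling as a stand-in for learned convolution layers (an assumption; the patent does not specify the layer types or depth):

```python
import numpy as np

def encoder(image, levels=3):
    """Extract features layer by layer: each level halves the spatial
    resolution, yielding a pyramid a decoder can later consume."""
    feats, x = [], image
    for _ in range(levels):
        # 2x2 average pooling as a stand-in for a strided conv layer.
        h, w = x.shape[0] // 2, x.shape[1] // 2
        x = x[:2 * h, :2 * w].reshape(h, 2, w, 2, -1).mean(axis=(1, 3))
        feats.append(x)
    return feats

pyramid = encoder(np.ones((8, 8, 3), dtype=np.float32))
```

An 8×8 input yields 4×4, 2×2, and 1×1 feature maps; the paired decoder would upsample back to full resolution to emit the mask (first stage) or the shadow-removed image (second stage).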
  16. The image processing apparatus according to claim 14, wherein the second-level network comprises:
    a second feature extraction module, comprising a second encoder, connected to the output of the first-level network, configured to receive the shadow area mask map output by the first-level network together with the image to be processed and obtain a second set of feature data;
    a result image output module, connected to the output of the second feature extraction module and comprising a second decoder, configured to output the shadow-removed image based on the second set of feature data.
  17. A storage medium, wherein the storage medium comprises a stored program, and when the program runs, a device on which the storage medium is located is controlled to execute the image processing method according to any one of claims 1 to 13.
  18. An electronic device, comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to execute the image processing method according to any one of claims 1 to 13 by executing the executable instructions.
PCT/CN2022/125573 2021-10-18 2022-10-17 Image processing method and apparatus, and storage medium and electronic device WO2023066173A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020247015956A KR20240089729A (en) 2021-10-18 2022-10-17 Image processing methods, devices, storage media and electronic devices

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111210502.3 2021-10-18
CN202111210502.3A CN116012232A (en) 2021-10-18 2021-10-18 Image processing method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2023066173A1 true

Family

ID=86019717

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125573 WO2023066173A1 (en) 2021-10-18 2022-10-17 Image processing method and apparatus, and storage medium and electronic device

Country Status (3)

Country Link
KR (1) KR20240089729A (en)
CN (1) CN116012232A (en)
WO (1) WO2023066173A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575976B (en) * 2024-01-12 2024-04-19 腾讯科技(深圳)有限公司 Image shadow processing method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180012101A1 (en) * 2016-07-08 2018-01-11 Xerox Corporation Shadow detection and removal in license plate images
CN111626951A (en) * 2020-05-20 2020-09-04 武汉科技大学 Image shadow elimination method based on content perception information
CN112819720A (en) * 2021-02-02 2021-05-18 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112991329A (en) * 2021-04-16 2021-06-18 浙江指云信息技术有限公司 Image shadow detection and elimination method based on GAN
CN113222845A (en) * 2021-05-17 2021-08-06 东南大学 Portrait external shadow removing method based on convolution neural network


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116310276A (en) * 2023-05-24 2023-06-23 泉州装备制造研究所 Target detection method, target detection device, electronic equipment and storage medium
CN116310276B (en) * 2023-05-24 2023-08-08 泉州装备制造研究所 Target detection method, target detection device, electronic equipment and storage medium
CN117726550A (en) * 2024-02-18 2024-03-19 成都信息工程大学 Multi-scale gating attention remote sensing image defogging method and system
CN117726550B (en) * 2024-02-18 2024-04-30 成都信息工程大学 Multi-scale gating attention remote sensing image defogging method and system
CN118521577A (en) * 2024-07-22 2024-08-20 中建四局安装工程有限公司 Control method and related equipment for intelligent production line of threaded connection type fire-fighting pipeline
CN118521577B (en) * 2024-07-22 2024-10-18 中建四局安装工程有限公司 Control method and related equipment for intelligent production line of threaded connection type fire-fighting pipeline

Also Published As

Publication number Publication date
KR20240089729A (en) 2024-06-20
CN116012232A (en) 2023-04-25

Similar Documents

Publication Publication Date Title
WO2023066173A1 (en) Image processing method and apparatus, and storage medium and electronic device
WO2022110638A1 (en) Human image restoration method and apparatus, electronic device, storage medium and program product
Zhang et al. Multi-scale single image dehazing using perceptual pyramid deep network
Wan et al. CoRRN: Cooperative reflection removal network
CN108932693B (en) Face editing and completing method and device based on face geometric information
WO2021103137A1 (en) Indoor scene illumination estimation model, method and device, and storage medium and rendering method
Xie et al. Joint super resolution and denoising from a single depth image
Li et al. Single image snow removal via composition generative adversarial networks
CN111626951B (en) Image shadow elimination method based on content perception information
WO2023212997A1 (en) Knowledge distillation based neural network training method, device, and storage medium
CN115641391A (en) Infrared image colorizing method based on dense residual error and double-flow attention
WO2023284401A1 (en) Image beautification processing method and apparatus, storage medium, and electronic device
CN114723760B (en) Portrait segmentation model training method and device and portrait segmentation method and device
CN109829925B (en) Method for extracting clean foreground in matting task and model training method
Liu et al. PD-GAN: perceptual-details gan for extremely noisy low light image enhancement
Guo et al. Deep illumination-enhanced face super-resolution network for low-light images
KR102628115B1 (en) Image processing method, device, storage medium, and electronic device
Zhao et al. Detecting deepfake video by learning two-level features with two-stream convolutional neural network
CN111553856A (en) Image defogging method based on depth estimation assistance
Xiao et al. Image hazing algorithm based on generative adversarial networks
Gao et al. Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization.
CN113781324A (en) Old photo repairing method
CN116934972A (en) Three-dimensional human body reconstruction method based on double-flow network
WO2023066099A1 (en) Matting processing
CN117350928A (en) Application of object-aware style transfer to digital images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22882779

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2024523517

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20247015956

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22882779

Country of ref document: EP

Kind code of ref document: A1