Search Results (17)

Search Parameters:
Keywords = face image inpainting

22 pages, 4962 KiB  
Article
Face Image Inpainting of Tang Dynasty Female Terracotta Figurines Based on an Improved Global and Local Consistency Image Completion Algorithm
by Qiangqiang Fan, Cong Wei, Shangyang Wu and Jinhan Xie
Appl. Sci. 2024, 14(24), 11621; https://doi.org/10.3390/app142411621 - 12 Dec 2024
Viewed by 656
Abstract
Tang Dynasty female terracotta figurines, as important relics of ceramic art, have commonly suffered natural and man-made damage, among which facial damage is severe. Image inpainting is widely used in cultural heritage fields such as murals and paintings, where rich datasets are available. However, its application in the restoration of Tang Dynasty terracotta figurines remains limited. This study first evaluates the extent of facial damage in Tang Dynasty female terracotta figurines, and then uses the Global and Local Consistency Image Completion (GLCIC) algorithm to restore the original appearance of the figurines, ensuring that the restored area is globally and locally consistent with the original image. To address the scarcity of data and the blurred facial features of the figurines, the study optimized the algorithm through data augmentation, guided filtering, and local enhancement techniques. The experimental results show that the improved algorithm restores the shape features of the figurines' faces with higher accuracy, but there is still room for improvement in terms of color and texture features. This study provides a new technical path for the protection and inpainting of Tang Dynasty terracotta figurines, and proposes an effective strategy for image inpainting under data scarcity.
(This article belongs to the Special Issue Advanced Technologies in Cultural Heritage)
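The abstract names guided filtering as one of the optimizations applied to the GLCIC output. As a rough illustration of how such a post-processing step can be applied (not the authors' actual code), the sketch below runs OpenCV's guided filter over an inpainted face crop, using the damaged original as the guidance image; the radius and eps values and the file names are illustrative assumptions.

```python
# Hedged sketch: guided-filter post-processing of an inpainted face image.
# Requires opencv-contrib-python (cv2.ximgproc). Parameter values are guesses,
# not the settings used in the paper.
import cv2
import numpy as np

def guided_refine(inpainted_bgr, guide_bgr, radius=8, eps=1e-2):
    """Edge-preserving refinement of the inpainted result, guided by the
    original (damaged) photograph so that restored edges follow real ones."""
    guide = guide_bgr.astype(np.float32) / 255.0
    src = inpainted_bgr.astype(np.float32) / 255.0
    out = cv2.ximgproc.guidedFilter(guide, src, radius, eps)
    return np.clip(out * 255.0, 0, 255).astype(np.uint8)

# Hypothetical usage:
# refined = guided_refine(cv2.imread("glcic_output.png"), cv2.imread("damaged_input.png"))
```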
Figures:
Figure 1. (a) Painted terracotta female figurine; (b) standing terracotta female figurine; (c) seated terracotta female figurine with a tricolor phoenix crown and bird (Source: the Palace Museum).
Figure 2. Tang Dynasty male-dressed female terracotta figurine (Source: Shaanxi History Museum).
Figure 3. Research flowchart.
Figure 4. (a) Tricolor hand-in-hand standing female figurine; (b) painted terracotta standing female figurine; (c) tricolor horse-riding figurine; (d) tricolor hand-in-hand standing female figurine (Source: the Palace Museum).
Figure 5. (a) Original face image; (b) face occlusion image; (c) Tang female terracotta figurine face image processed by the original GLCIC algorithm.
Figure 6. (a) Original face image; (b) face occlusion image; (c) Tang female terracotta figurine face image processed by the original GLCIC algorithm.
Figure 7. Algorithm architecture diagram.
Figure 8. Weight distribution for Tang Dynasty female terracotta figurine facial features.
Figure 9. Distribution of facial deficiency degrees in Tang Dynasty terracotta samples.
Figure 10. Tang Dynasty female terracotta figurine face original images: (a) standing female terracotta figurine; (b) tricolor female terracotta figurine; (c) tricolor standing female terracotta figurine; (d) standing female terracotta figurine; (e) painted female terracotta figurine.
Figure 11. Tang Dynasty female terracotta figurine face occlusion images: (a) standing female terracotta figurine; (b) tricolor female terracotta figurine; (c) tricolor standing female terracotta figurine; (d) standing female terracotta figurine; (e) painted female terracotta figurine.
Figure 12. Tang female terracotta figurine face images processed by the original GLCIC algorithm: (a) standing female terracotta figurine; (b) tricolor female terracotta figurine; (c) tricolor standing female terracotta figurine; (d) standing female terracotta figurine; (e) painted female terracotta figurine.
Figure 13. Tang female terracotta figurine face images processed by the improved GLCIC algorithm: (a) standing female terracotta figurine; (b) tricolor female terracotta figurine; (c) tricolor standing female terracotta figurine; (d) standing female terracotta figurine; (e) painted female terracotta figurine.
Figure 14. (a) Painted terracotta standing female terracotta figurine; (b) painted terracotta standing female terracotta figurine occlusion image; (c) Tang female terracotta figurine face image processed by the improved GLCIC algorithm.
20 pages, 3670 KiB  
Article
Enhancing Visual Odometry with Estimated Scene Depth: Leveraging RGB-D Data with Deep Learning
by Aleksander Kostusiak and Piotr Skrzypczyński
Electronics 2024, 13(14), 2755; https://doi.org/10.3390/electronics13142755 - 13 Jul 2024
Viewed by 1199
Abstract
Advances in visual odometry (VO) systems have benefited from the widespread use of affordable RGB-D cameras, improving indoor localization and mapping accuracy. However, older sensors like the Kinect v1 face challenges due to depth inaccuracies and incomplete data. This study compares indoor VO systems that use RGB-D images, exploring methods to enhance depth information. We examine conventional image inpainting techniques and a deep learning approach, utilizing newer depth data from devices like the Kinect v2. Our research highlights the importance of refining data from lower-quality sensors, which is crucial for cost-effective VO applications. By integrating deep learning models with richer context from RGB images and more comprehensive depth references, we demonstrate improved trajectory estimation compared to standard methods. This work advances budget-friendly RGB-D VO systems for indoor mobile robots, emphasizing deep learning's role in leveraging connections between image appearance and depth data.
(This article belongs to the Special Issue Applications of Machine Vision in Robotics)
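The Navier–Stokes and Telea baselines referred to in the figures below are available directly in OpenCV. A minimal sketch of how missing Kinect v1 depth pixels could be filled this way follows; the depth scaling and the mask convention (zero = missing) are assumptions, not the authors' exact pipeline.

```python
# Hedged sketch: filling invalid (zero) depth pixels with OpenCV's classical
# inpainting, matching the Navier-Stokes and Telea baselines compared here.
import cv2
import numpy as np

def inpaint_depth(depth_mm, method=cv2.INPAINT_NS, radius=5):
    """depth_mm: uint16 depth image in millimetres; zeros mark missing data."""
    mask = (depth_mm == 0).astype(np.uint8)                  # 1 where depth is missing
    depth_max = max(int(depth_mm.max()), 1)
    # cv2.inpaint works on 8-bit images, so compress the range first.
    depth_8u = cv2.convertScaleAbs(depth_mm, alpha=255.0 / depth_max)
    filled_8u = cv2.inpaint(depth_8u, mask, radius, method)
    # Map back to the original depth range (coarse; loses precision).
    return (filled_8u.astype(np.float32) * depth_max / 255.0).astype(np.uint16)

# Hypothetical usage:
# filled = inpaint_depth(kinect_v1_depth, cv2.INPAINT_TELEA)
```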
Figures:
Figure 1. Block scheme of the simple VO system used in this research.
Figure 2. PUTKK dataset examples: visualization of RGB-colored point clouds registered with the Kinect v1 and ground-truth camera poses for one of the PUTKK dataset sequences (a), sample Kinect v1 depth frame (b), and sample Kinect v2 depth frame (c) from this dataset.
Figure 3. Block scheme of the Particle Swarm Optimization algorithm.
Figure 4. Block scheme of the adaptive Evolutionary Algorithm.
Figure 5. The U-Net architecture of a CNN is based on the Monodepth network, which is used for depth completion with RGB or RGB-D input from Kinect v1.
Figure 6. Block scheme of the fine-tuning procedure for the Monodepth model.
Figure 7. Search method for the best learning rate with FastAi (a) and learning results (b).
Figure 8. Depth maps: (a) original, (b) Navier–Stokes (NS), (c) Telea, and (d) learned with RGB-D frames.
Figure 9. Kinect v1 depth maps: (a) original, (b) estimated by the original Monodepth model, and (c) original depth image completed by learned depth with RGB inference.
Figure 10. Colormap visualization of the difference between the estimated scene depth and the Kinect v2 ground truth for the improved Monodepth model inference with Kinect v1 RGB frames only (a) and with both RGB and depth frames from Kinect v1 (b). See text for further explanation.
Figure 11. Trajectory estimation results for the putkk_Dataset_1_Kin_1 sequence for our VO system working with: (a,e) no inpainting, (b,f) NS inpainting, (c,g) Telea inpainting, or (d,h) learned depth with RGB-D inference. First row: ATE error plots; second: translational RPE plots.
Figure 12. Trajectory estimation results for the putkk_Dataset_2_Kin_1 sequence for our VO system working with: (a,e) no inpainting, (b,f) NS inpainting, (c,g) Telea inpainting, or (d,h) learned depth with RGB-D inference. First row: ATE error plots; second: translational RPE plots.
Figure 13. Trajectory estimation results for the putkk_Dataset_3_Kin_1 sequence for our VO system working with: (a,e) no inpainting, (b,f) NS inpainting, (c,g) Telea inpainting, or (d,h) learned depth with RGB-D inference. First row: ATE error plots; second: translational RPE plots.
Figure 14. Trajectory estimation results for Kinect v1 frames on the putkk_Dataset_2_Kin_1 sequence for the VO system working with (a,d) no inpainting, (b,e) learned depth completion with RGB inference, and (c,f) learned depth completion with RGB-D inference. The first row presents ATE error plots. The second includes translational RPE plots.
Figure 15. Trajectory estimation results for Kinect v1 frames on the putkk_Dataset_3_Kin_1 sequence for the VO system working with (a,d) no inpainting, (b,e) learned depth completion with RGB inference, and (c,f) learned depth completion with RGB-D inference. The first row presents ATE error plots. The second includes translational RPE plots.
23 pages, 9314 KiB  
Article
MAM-E: Mammographic Synthetic Image Generation with Diffusion Models
by Ricardo Montoya-del-Angel, Karla Sam-Millan, Joan C. Vilanova and Robert Martí
Sensors 2024, 24(7), 2076; https://doi.org/10.3390/s24072076 - 24 Mar 2024
Cited by 2 | Viewed by 3014
Abstract
Generative models are used as an alternative data augmentation technique to alleviate the data scarcity problem faced in the medical imaging field. Diffusion models have gathered special attention due to their innovative generation approach, the high quality of the generated images, and their relatively less complex training process compared with Generative Adversarial Networks. Still, the implementation of such models in the medical domain remains at an early stage. In this work, we propose exploring the use of diffusion models for the generation of high-quality, full-field digital mammograms using state-of-the-art conditional diffusion pipelines. Additionally, we propose using stable diffusion models for the inpainting of synthetic mass-like lesions on healthy mammograms. We introduce MAM-E, a pipeline of generative models for high-quality mammography synthesis controlled by a text prompt and capable of generating synthetic mass-like lesions on specific regions of the breast. Finally, we provide quantitative and qualitative assessment of the generated images and easy-to-use graphical user interfaces for mammography synthesis.
(This article belongs to the Special Issue Image Analysis and Biomedical Sensors)
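For readers who want to reproduce the general idea, the Hugging Face diffusers library exposes a Stable Diffusion inpainting pipeline of the kind the abstract describes. The sketch below is a generic example, not MAM-E itself: the checkpoint name, file names, prompt, and guidance scale are placeholders, and the paper uses its own mammography-tuned weights.

```python
# Hedged sketch: text-conditioned lesion inpainting with a generic Stable
# Diffusion inpainting checkpoint (placeholder, not the MAM-E weights).
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",   # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

healthy_mammogram = Image.open("healthy_mammogram.png").convert("RGB")  # hypothetical file
lesion_mask = Image.open("lesion_mask.png").convert("L")                # white = inpaint here

result = pipe(
    prompt="a mammogram in MLO view with a mass",  # illustrative prompt
    image=healthy_mammogram,
    mask_image=lesion_mask,
    guidance_scale=4.0,
).images[0]
result.save("synthetic_lesion.png")
```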
Figures:
Figure 1. Graphical user interface of MAM-E for generation of synthetic healthy mammograms.
Figure 2. Resizing and cropping of an OMI-H mammogram. The same process was conducted for VinDr mammograms.
Figure 3. Examples of training mammograms (real) and their respective text prompts for OMI-H (a,b) and VinDr (c,d).
Figure 4. Forward and reverse diffusion process.
Figure 5. Linear and scaled beta schedulers (left) and their effects on the mean (blue) and variance (orange) of the noise sampling distributions (right).
Figure 6. Reverse diffusion process using a denoising UNet. The upblock layers are a mirror of the downblock layers.
Figure 7. Classifier-free guidance geometrical interpretation. As the guidance scale increases, the image is pushed further in the prompt direction.
Figure 8. Overall MAM-E pipeline combining both full-field generation and lesion inpainting tasks. In dark green, the inputs needed for a full synthetic mammogram generation with lesion. In light green, the optional input for lesion inpainting on real images, instead of full-field synthetic images. In red, the outputs of each task.
Figure 9. Example of the latent space representation of an image on the top and the original and reconstructed images on the bottom.
Figure 10. Denoising UNet architecture used for the reverse diffusion process. The upblock structure is a mirror of the downblock.
Figure 11. Inpainting training pipeline. The mask is reshaped to match the image size of the latent representations (64 × 64). The same UNet as in the Stable Diffusion pipeline is used.
Figure 12. Training evolution of the diffusion process on an unconditional pretrained model at epochs 1, 3, 6, and 10.
Figure 13. Training evolution of SD with Hologic images at epochs 1, 3, 6, and 10. The prompt is: "a mammogram in MLO view with small area".
Figure 14. Training evolution of the diffusion process on a conditional pretrained model trained with Siemens images at epochs 1, 3, 6, and 10. The prompt is: "a mammogram in CC view with high density".
Figure 15. Training evolution of the diffusion process on a conditional pretrained model trained with both Siemens and Hologic images at epochs 1, 3, 7, and 40. The prompt is: "a siemens mammogram in MLO view with high density and small area".
Figure 16. Guidance effect on the generation output. From upper-left to lower-right, the guidance varies in a range from 1 to 4. Prompt: "A siemens mammogram in MLO view with small area and very high density".
Figure 17. Receiver Operating Characteristic curve of radiological assessment.
Figure 18. Explainability AI method heatmaps of synthetic lesions inpainted on real healthy mammograms.
Figure 19. MAM-E lesion drawing tool.
Figure 20. Examples of unsuccessful image generation of the combined dataset models coming from the same text prompt. The prompt was "A siemens mammogram in MLO view with small area and very high density" with a guidance scale of 4.
19 pages, 44295 KiB  
Article
A U-Net Architecture for Inpainting Lightstage Normal Maps
by Hancheng Zuo and Bernard Tiddeman
Computers 2024, 13(2), 56; https://doi.org/10.3390/computers13020056 - 19 Feb 2024
Viewed by 2066
Abstract
In this paper, we investigate the inpainting of normal maps that were captured from a lightstage. Occlusion of parts of the face during performance capture can be caused by the movement of, e.g., arms, hair, or props. Inpainting is the process of interpolating missing areas of an image with plausible data. We build on previous work on general image inpainting that uses generative adversarial networks (GANs). We extend our previous work on normal map inpainting to use a U-Net structured generator network. Our method takes into account the nature of the normal map data and so requires modification of the loss function: we use a cosine loss rather than the more common mean squared error loss when training the generator. Due to the small amount of training data available, even when using synthetic datasets, we require significant augmentation, which also needs to take account of the particular nature of the input data; image flipping and in-plane rotations need to properly flip and rotate the normal vectors. During training, we monitor key performance metrics, including the average loss, structural similarity index measure (SSIM), and peak signal-to-noise ratio (PSNR) of the generator, alongside the average loss and accuracy of the discriminator. Our analysis reveals that the proposed model generates high-quality, realistic inpainted normal maps, demonstrating the potential for application to performance capture. The results of this investigation provide a baseline on which future researchers can build, with more advanced networks and comparison with inpainting of the source images used to generate the normal maps.
(This article belongs to the Special Issue Selected Papers from Computer Graphics & Visual Computing (CGVC 2023))
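The cosine loss mentioned in the abstract compares normals by angle rather than by per-channel squared error. A minimal PyTorch sketch is shown below; the tensor layout (N, 3, H, W), the mask convention (1 = hole), and the averaging are assumptions rather than the paper's exact implementation.

```python
# Hedged sketch of a cosine reconstruction loss for normal-map inpainting.
import torch
import torch.nn.functional as F

def normal_cosine_loss(pred, target, mask):
    """pred/target: (N, 3, H, W) normal maps; mask: (N, 1, H, W), 1 = inpainted region."""
    pred_n = F.normalize(pred, dim=1)                      # unit-length predicted normals
    target_n = F.normalize(target, dim=1)                  # unit-length ground-truth normals
    cos = (pred_n * target_n).sum(dim=1, keepdim=True)     # per-pixel cosine similarity
    loss = (1.0 - cos) * mask                              # penalise angular error inside the hole
    return loss.sum() / mask.sum().clamp(min=1.0)
```

The same care applies to the augmentation the abstract mentions: a horizontal flip of a normal map must also negate the x component of every stored normal, and an in-plane rotation must rotate the (x, y) components accordingly.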
Figures:
Figure 1. Examples of augmentations, rows from left to right: original; flipped; random zoom and rotation; flipped with (different) random zoom and rotation. The colours show the red, green and blue (RGB) values stored, which are mapped on to (x, y, z) directions.
Figure 2. A schematic of a U-Net-based generative adversarial network (GAN) with skip connections, showing the input and output image dimensions as 256 × 256 pixels and detailing the loss functions for the generator and discriminator.
Figure 3. A schematic of a bow-tie-like-based generative adversarial network (GAN), showing the input and output image dimensions as 256 × 256 pixels and detailing the loss functions for the generator and discriminator.
Figure 4. Performance comparison of U-Net and bow-tie-like model structures across three mask types. Each set of two rows represents a different mask type, with U-Net results in the top row and bow-tie-like results in the bottom row of each set. From top to bottom, the first set is for the irregular lines mask, the second set is for the scattered smaller blobs mask, and the third set is for the single large blob mask. Within each row, from left to right, the images compare the raw image, masked image, predicted image, and predicted image in the masked region only.
Figure 5. Performance comparison of U-Net and bow-tie-like model structures across two types of masks, particularly relevant to real-life cases of missing or damaged regions in face images. Each set of two rows represents a different mask type, with U-Net results in the top row and bow-tie-like results in the bottom row of each set. From top to bottom, the first set is for the rotating large stripe mask, and the second set is for the edge crescent mask. Within each row, from left to right, the images compare the raw image, masked image, predicted image, and predicted image in the masked region only.
Figure 6. During the initial epochs of testing on the irregular lines mask, the U-Net model (left) and the bow-tie-like model (right) exhibit distinct performance characteristics. The generator loss, indicated in green, and the SSIM, represented in purple, evolve differently across the epochs for each model.
Figure 7. Comparative performance of three training configurations of a GAN. The top row depicts results when only the generator is trained. The middle row shows outcomes for simultaneous training of both the generator and discriminator. The bottom row presents results from cyclic training, where the generator is trained to convergence first, followed by the discriminator.
Figure 8. Three sets of training configurations for GANs: the top row for 'Generator-only' training, the middle for 'Simultaneous Generator and Discriminator', and the bottom for 'Generator to Convergence, then Discriminator', each displaying generator loss (in green) and image quality metrics SSIM/PSNR (in purple).
Figure 9. Visual results of generator loss experiments with varying λ_recon and λ_adv ratios. The top row corresponds to λ_recon = 999, λ_adv = 1; the middle row to λ_recon = 100, λ_adv = 1; and the bottom row to λ_recon = 10, λ_adv = 1.
Figure 10. From top to bottom, the performance is shown with and without an irregular lines mask used as input for training the generator; from left to right, images compare the raw image, masked image, predicted image, and predicted image in the masked region only.
21 pages, 5608 KiB  
Article
Low-Cost Training of Image-to-Image Diffusion Models with Incremental Learning and Task/Domain Adaptation
by Hector Antona, Beatriz Otero and Ruben Tous
Electronics 2024, 13(4), 722; https://doi.org/10.3390/electronics13040722 - 10 Feb 2024
Viewed by 2299
Abstract
Diffusion models specialized in image-to-image translation tasks, like inpainting and colorization, have outperformed the state of the art, yet their computational requirements are exceptionally demanding. This study analyzes different strategies to train image-to-image diffusion models in a low-resource setting. The studied strategies include incremental learning and task/domain transfer learning. First, a base model for human face inpainting is trained from scratch with an incremental learning strategy. The resulting model achieves an FID score almost equal to that of its batch-learning counterpart while significantly reducing the training time. Second, the base model is fine-tuned to perform a different task, image colorization, and to operate in a different domain, landscape images. The resulting colorization models showcase exceptional performance with a minimal number of training epochs. We examine the impact of different configurations and provide insights into the capacity of image-to-image diffusion models for transfer learning across tasks and domains.
(This article belongs to the Section Electronic Multimedia)
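At a high level, the task/domain transfer described here amounts to initializing from the base inpainting checkpoint and training briefly on the new conditioning task. The sketch below is only schematic: the diffusion_loss interface, checkpoint path, learning rate, and epoch count are hypothetical stand-ins for the paper's components.

```python
# Hedged sketch: fine-tuning a base image-to-image diffusion model for a new
# task (colorization). The model's training interface is an assumption.
import torch

def finetune_for_colorization(model, colorization_loader, epochs=5, lr=1e-5):
    """model: an image-to-image diffusion network already trained for inpainting,
    assumed to expose a diffusion_loss(cond, target) method (hypothetical)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)  # small LR for fine-tuning
    for _ in range(epochs):                                   # few epochs suffice after transfer
        for gray, color in colorization_loader:               # conditioning image, target image
            loss = model.diffusion_loss(gray, color)          # standard denoising objective
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```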
Figures:
Figure 1. Diffusion model architecture from [4], transforming data into a simple noise distribution. Then, this process can be reversed by utilizing the score of the distribution at each intermediate timestep.
Figure 2. U-Net architecture (extracted from [6]). Each blue box represents a feature map with multiple channels, and the number of channels is indicated on top of the box. The x–y size of the feature map is provided at the lower left edge of the box. White boxes indicate copied feature maps, while the arrows indicate the various operations performed.
Figure 3. Conditional (image-to-image) diffusion model inference workflow.
Figure 4. Outline of the methodology workflow.
Figure 5. Masked image for inpainting and ground truth image.
Figure 6. Gray level (3 channels) and ground truth image.
Figure 7. Gray level and ground truth mountain image.
Figure 8. Gray level and ground truth forest image.
Figure 9. Process of inpainting through denoising.
Figure 10. Qualitative comparison for inpainting in the CelebA-HQ dataset. The ground-truth image is on the left, and the generated is on the right. In the center is the masked image.
Figure 11. More qualitative comparison for inpainting in the CelebA-HQ dataset. The ground-truth image is on the left, and the generated is on the right for each pair.
Figure 12. Qualitative comparison for colorization in the CelebA-HQ dataset. The ground-truth image is on the left, and the generated is on the right. In the center is the masked image in a gray-level scale.
Figure 13. Qualitative comparison for colorization in the Places2 dataset. The ground-truth image is on the left, and the generated is on the right. In the center is the masked image in a gray-level scale.
Figure 14. Qualitative comparison for colorization in the Places2 dataset. The ground-truth image is on the left, and the generated is on the right for each pair.
17 pages, 8039 KiB  
Article
A Realistic Hand Image Composition Method for Palmprint ROI Embedding Attack
by Licheng Yan, Lu Leng, Andrew Beng Jin Teoh and Cheonshik Kim
Appl. Sci. 2024, 14(4), 1369; https://doi.org/10.3390/app14041369 - 7 Feb 2024
Cited by 3 | Viewed by 1343
Abstract
Palmprint recognition (PPR) has recently garnered attention due to its robustness and accuracy. Many PPR methods rely on preprocessing the region of interest (ROI). However, the emergence of ROI attacks capable of generating synthetic ROI images poses a significant threat to PPR systems. Despite this, ROI attacks are less practical since PPR systems typically take hand images as input rather than just the ROI. Therefore, there is a pressing need for a method that specifically targets the system by composing hand images. The intuitive approach involves embedding an ROI into a hand image, a comparatively simpler process requiring less data than generating entirely synthetic images. However, embedding faces challenges, as the composited hand image must maintain a consistent color and texture. To overcome these challenges, we propose a training-free, end-to-end hand image composition method incorporating ROI harmonization and palm blending. The ROI harmonization process iteratively adjusts the ROI to seamlessly integrate with the hand using a modified style transfer method. Simultaneously, palm blending employs a pretrained inpainting model to composite a hand image with a continuous transition. Our results demonstrate that the proposed method achieves a high attack performance on the IITD and Tongji datasets, with the composited hand images exhibiting realistic visual quality.
(This article belongs to the Special Issue Multimedia Systems Studies)
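To make the appearance-consistency problem concrete, the sketch below shows the naive baseline that ROI harmonization and palm blending are designed to improve on: pasting the ROI back into a hand image with nothing more than a feathered alpha mask. This is not the paper's method; the coordinates, feather width, and file handling are assumptions, and the result typically shows the color and texture mismatch the abstract describes.

```python
# Hedged sketch: naive ROI embedding by feathered alpha compositing (baseline
# illustration only, not the proposed harmonization/blending pipeline).
import cv2
import numpy as np

def naive_roi_embed(hand_bgr, roi_bgr, top_left, feather=15):
    """Paste roi_bgr into hand_bgr at top_left = (y, x), feathering the ROI border."""
    y, x = top_left
    h, w = roi_bgr.shape[:2]
    alpha = np.zeros((h, w), np.float32)
    alpha[feather:h - feather, feather:w - feather] = 1.0     # 1 inside, 0 in a border band
    k = 2 * feather + 1
    alpha = cv2.GaussianBlur(alpha, (k, k), 0)[..., None]     # soft transition at the edges
    out = hand_bgr.astype(np.float32).copy()
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = alpha * roi_bgr.astype(np.float32) + (1.0 - alpha) * region
    return out.astype(np.uint8)
```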
Figures:
Figure 1. Hand attack, ROI attack, and palmprint recognition flow.
Figure 2. The illustration of low and high appearance consistency in ROI embedding. The second row depicts a seamless integration of carrier and ROI images.
Figure 3. The comparison of image composition [21] and hand composition.
Figure 4. The pipeline of ROI embedding.
Figure 5. The pretrained CAST with modified loss functions is utilized to realize ROI harmonization.
Figure 6. The pipeline of making inpainting masks.
Figure 7. The illustration of Palm Blending.
Figure 8. The dataset samples of Tongji, IITD, Gao and Zhou.
Figure 9. Cross-dataset harmonized images result.
Figure 10. Harmonized images in different iterations.
Figure 11. The L1, L2, Lpips, and texture loss in different iterations.
Figure 12. The images of the ablation study of cycle losses.
Figure 13. The distribution of ROI embedding attack in four datasets (ROI2Hand).
Figure 14. The composited hand images of four datasets.
12 pages, 2446 KiB  
Article
Recovery-Based Occluded Face Recognition by Identity-Guided Inpainting
by Honglei Li, Yifan Zhang, Wenmin Wang, Shenyong Zhang and Shixiong Zhang
Sensors 2024, 24(2), 394; https://doi.org/10.3390/s24020394 - 9 Jan 2024
Cited by 1 | Viewed by 2119
Abstract
Occlusion in facial photos poses a significant challenge for machine detection and recognition. Consequently, occluded face recognition for camera-captured images has emerged as a prominent and widely discussed topic in computer vision. Current standard face recognition methods achieve remarkable performance on unoccluded faces but perform poorly when directly applied to occluded face datasets. The main reason lies in the absence of identity cues caused by occlusions. Therefore, a direct idea of recovering the occluded areas through an inpainting model has been proposed. However, existing inpainting models based on an encoder-decoder structure are limited in preserving inherent identity information. To solve the problem, we propose ID-Inpainter, an identity-guided face inpainting model, which preserves the identity information to the greatest extent through a more accurate identity sampling strategy and a GAN-like fusing network. We conduct recognition experiments on occluded face photographs from the LFW, CFP-FP, and AgeDB-30 datasets, and the results indicate that our method achieves state-of-the-art performance in identity-preserving inpainting and dramatically improves the accuracy of normal recognizers in occluded face recognition.
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
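A common way to make an inpainter identity-preserving, and the idea behind the CA-cos baseline shown in Figure 5 below, is to penalize the cosine distance between recognizer embeddings of the inpainted face and the reference face. The PyTorch sketch below illustrates that term only; the recognizer interface (images to L2-normalizable embeddings) and the decision to freeze it are assumptions, and ID-Inpainter's actual sampling-and-fusion networks are more involved.

```python
# Hedged sketch: identity-preserving loss term based on a frozen recognizer.
import torch
import torch.nn.functional as F

def identity_loss(recognizer, inpainted, reference):
    """recognizer: frozen face recognizer returning embedding vectors (N, D)."""
    with torch.no_grad():
        f_ref = F.normalize(recognizer(reference), dim=1)   # target identity feature
    f_out = F.normalize(recognizer(inpainted), dim=1)       # identity of the inpainted face
    return (1.0 - (f_out * f_ref).sum(dim=1)).mean()        # mean cosine distance
```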
Figures:
Figure 1. Encoder-decoder-structured identity-preserving inpainting networks with different identity training losses. C is an encoder-decoder-structured content inpainting network, and R is a pretrained recognizer. f_id, f_o, and f_r are identity-centered features, occlusion-recovered features, and real face features, respectively.
Figure 2. We get the recovered result closer to the ground truth by sampling from a closer distribution, which is learned with an unoccluded dataset.
Figure 3. The overall pipeline of our approach. It is divided into verification and training phases. The verification phase consists of two modules: ID-Inpainter I and recognizer R. ID-Inpainter I consists of three sub-networks, i.e., content inpainter C, identity sampler S, and identity fusor F. In the training phase, ground truth faces X_g, occlusion masks M, and reference images X_s are put into I to train an identity-guided inpainting model. In the verification phase, the masked face is used as the reference face to implement identity-preserving inpainting. Finally, the inpainted result is recognized by a normal recognizer R.
Figure 4. The structure of the k-th AIFB. Each block consists of an ID-fusion path and a reconstruction path.
Figure 5. Inpainting results generated by different models. In each row, from left to right, they are the masked face, inpainting results by PIC [9], CA [8], CA with cosine identity loss (CA-cos), CA with central-diversity loss [15] (CA-div), ID-Inpainter on PIC (PIC-F), ID-Inpainter on CA (CA-F), and the ground truth (GT).
Figure 6. Inpainting results from different modulation modules.
Figure 7. Visualization of feature distributions by converting 256-D features to 2-D with t-SNE [36], followed by normalization. Different markers with colors represent different classes. Zoom in for a better view.
22 pages, 8887 KiB  
Article
GANMasker: A Two-Stage Generative Adversarial Network for High-Quality Face Mask Removal
by Mohamed Mahmoud and Hyun-Soo Kang
Sensors 2023, 23(16), 7094; https://doi.org/10.3390/s23167094 - 10 Aug 2023
Cited by 9 | Viewed by 2590
Abstract
Deep-learning-based image inpainting methods have made remarkable advancements, particularly in object removal tasks. The removal of face masks has gained significant attention, especially in the wake of the COVID-19 pandemic, and while numerous methods have successfully addressed the removal of small objects, removing large and complex masks from faces remains demanding. This paper presents a novel two-stage network for unmasking faces considering the intricate facial features typically concealed by masks, such as noses, mouths, and chins. Additionally, the scarcity of paired datasets comprising masked and unmasked face images poses an additional challenge. In the first stage of our proposed model, we employ an autoencoder-based network for binary segmentation of the face mask. Subsequently, in the second stage, we introduce a generative adversarial network (GAN)-based network enhanced with attention and Masked–Unmasked Region Fusion (MURF) mechanisms to focus on the masked region. Our network generates realistic and accurate unmasked faces that resemble the original faces. We train our model on paired unmasked and masked face images sourced from CelebA, a large public dataset, and evaluate its performance on multi-scale masked faces. The experimental results illustrate that the proposed method surpasses the current state-of-the-art techniques in both qualitative and quantitative metrics. It achieves a Peak Signal-to-Noise Ratio (PSNR) improvement of 4.18 dB over the second-best method, with the PSNR reaching 30.96. Additionally, it exhibits a 1% increase in the Structural Similarity Index Measure (SSIM), achieving a value of 0.95.
(This article belongs to the Special Issue Deep Learning Based Face Recognition and Feature Extraction)
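The PSNR and SSIM figures quoted above can be reproduced for any result/ground-truth pair with scikit-image, as in the hedged sketch below; the uint8 RGB layout and data_range are assumptions about how the evaluation is set up, not the authors' exact protocol.

```python
# Hedged sketch: computing PSNR and SSIM for an unmasked result vs. ground truth.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(result_rgb, gt_rgb):
    """Both images are uint8 RGB arrays of the same shape."""
    psnr = peak_signal_noise_ratio(gt_rgb, result_rgb, data_range=255)
    ssim = structural_similarity(gt_rgb, result_rgb, channel_axis=-1, data_range=255)
    return psnr, ssim
```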
Figures:
Figure 1. Overview of our two-stage approach for face unmasking. The first stage, Mask Segmentation, takes the masked face as input and generates a binary mask map. The second stage, Face Unmasking (GAN-Net), utilizes the masked face and the generated mask map from the first stage to generate the unmasked face.
Figure 2. Comprehensive network architecture for face unmasking, comprising two key stages: (1) mask segmentation (M-Seg) utilizing an autoencoder architecture, and (2) face unmasking employing a GAN network. The generator module incorporates a residual attention mechanism and the MURF block, enhancing the efficacy of the unmasking process. The VGG19 network serves as the perceptual network.
Figure 3. Detailed depiction of the main components: (a) Se-Block used for binary mask segmentation in the first stage, and (b) Ge-Block representing the generator block.
Figure 4. Examples of the diverse masks used in the synthetic dataset. The masks exhibit different shapes, sizes, and colors, comprehensively representing various masking scenarios.
Figure 5. Samples from the synthetic dataset illustrating different sizes (256 × 256 and 512 × 512). The first column displays the original unmasked face images, the second column shows the corresponding masked face images, and the third column represents the binary mask map.
Figure 6. Qualitative comparison with state-of-the-art methods. The first column shows the input masked images, followed by four output columns obtained by GLCIC, Gated-Conv, GUMF, and our model, respectively, while the last column displays the corresponding ground truth images.
Figure 7. Some samples of the proposed model's results on front-facing images. Each sample includes the input masked face, the generated binary mask map, the proposed model's output, and the ground truth.
Figure 8. Some samples of the proposed model's results on side-facing images. Each sample includes the input masked face, the generated binary mask map, the proposed model's output, and the ground truth.
Figure 9. Sample results of our model on images of size 512 × 512, demonstrating its effectiveness on various mask types, sizes, angles, and colors. The first column depicts the masked input face, followed by the binary mask map generated in the first stage. The third column showcases our model's output, while the fourth column presents the corresponding ground truth for comparison.
Figure 10. Failure results: challenging cases and mask map detection errors.
Figure 11. Comparison of face unmasking results between the two- and one-stage models. The left column shows input masked faces, followed by the outputs from the one-stage model. The third column displays outputs from the two-stage model, while the last column presents ground truth unmasked faces.
13 pages, 4992 KiB  
Article
Efficient Face Region Occlusion Repair Based on T-GANs
by Qiaoyue Man and Young-Im Cho
Electronics 2023, 12(10), 2162; https://doi.org/10.3390/electronics12102162 - 9 May 2023
Cited by 1 | Viewed by 1683
Abstract
In the image restoration task, the generative adversarial network (GAN) demonstrates excellent performance. However, significant challenges remain in generative face region inpainting: traditional approaches are ineffective at maintaining global consistency among facial components and recovering fine facial details. To address this challenge, this study proposes a facial restoration generation network that combines a transformer module and a GAN to accurately detect the missing feature parts of the face and perform effective, fine-grained restoration. We validate the proposed model using different image quality evaluation methods and several open-source face datasets, and experimentally demonstrate that our model outperforms other current state-of-the-art network models in terms of generated image quality and the coherent naturalness of facial features in face image restoration tasks.
(This article belongs to the Special Issue AI Technologies and Smart City)
Figures:
Figure 1. T-GANs framework.
Figure 2. Transformer module architecture.
Figure 3. Facial restoration generative network architecture.
Figure 4. Open-source datasets.
Figure 5. Facial missing feature mask.
Figure 6. Comparison of generation results for missing features in different parts of the face.
Figure 7. Comparison of generated results for large-area face feature loss (wearing a mask).
Figure 8. Comparison of the restoration effect of different generative networks for different missing facial features.
Figure 9. Image restoration generation for large facial feature loss.
16 pages, 7351 KiB  
Article
A Fast Specular Highlight Removal Method for Smooth Liquor Bottle Surface Combined with U2-Net and LaMa Model
by Shaojie Guo, Xiaogang Wang, Jiayi Zhou and Zewei Lian
Sensors 2022, 22(24), 9834; https://doi.org/10.3390/s22249834 - 14 Dec 2022
Cited by 5 | Viewed by 1814
Abstract
Highlight removal is a critical and challenging problem. For the complex highlights that appear on the surface of smooth liquor bottles in natural scenes, traditional highlight removal algorithms cannot semantically disambiguate between all-white or near-white materials and highlights, while recent deep-learning-based highlight removal algorithms lack flexibility in network architecture, are difficult to train, and have limited object applicability. As a result, they cannot accurately locate and remove highlights on small, highly specific highlight datasets, which reduces the performance of downstream tasks. Therefore, this paper proposes a fast highlight removal method combining U2-Net and LaMa. The method consists of two stages. In the first stage, the U2-Net network detects the specular reflection component in the liquor bottle input image and generates mask maps for the highlight areas in batches. In the second stage, the liquor bottle input image and the mask map generated by U2-Net are input to the LaMa network, and the surface highlights of the smooth liquor bottle are removed by relying on the powerful image inpainting performance of LaMa. Experiments on our self-made liquor bottle surface highlight dataset showed that this method outperformed other advanced methods in highlight detection and removal.
(This article belongs to the Section Sensing and Imaging)
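The two-stage idea, detect highlights with a segmentation network and then hand the binary mask to an inpainting model, can be summarized in a few lines. In the sketch below, u2net and lama are placeholders for the pretrained models (their loading and pre/post-processing are omitted), and the threshold and dilation size are illustrative assumptions.

```python
# Hedged sketch of the two-stage highlight removal pipeline.
import cv2
import numpy as np

def remove_highlights(image_bgr, u2net, lama, threshold=0.5, dilate_px=3):
    """u2net: callable returning an (H, W) highlight probability map in [0, 1];
    lama: callable taking (image, binary mask) and returning the inpainted image."""
    prob = u2net(image_bgr)                                 # highlight probability map
    mask = (prob > threshold).astype(np.uint8) * 255        # binarized highlight mask
    # Slightly dilate so the inpainting also covers highlight halos.
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    mask = cv2.dilate(mask, kernel, iterations=1)
    return lama(image_bgr, mask)                            # inpainted, highlight-free image
```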
Figures:
Figure 1. The green box is the false detection of defects caused by highlights.
Figure 2. Experimental objects and results of the traditional algorithm based on the two-color reflection model.
Figure 3. High-quality pixel clustering method based on the two-color reflection model: (a) surface color dim and generally smooth; (b) colorful and smooth surface.
Figure 4. Sun et al.'s processing results of simple highlight areas on ceramic bottles.
Figure 5. Sun et al.'s processing results of multiregion complex highlights on smooth liquor bottles: (a) specular reflection component separation results; (b) the result of the area filling method processing on the basis of the former.
Figure 6. Experimental results of the sparse and low-rank reflection model on Guo et al.'s dataset (the first row is the input highlight image; the second row is the highlight processing map).
Figure 7. Experimental results of the same method on self-made datasets (the left side is the input highlight image; the right side is the highlight processing image).
Figure 8. Test results of a multitask network model for joint highlight detection and removal: (a) input image; (b) output image.
Figure 9. Flowchart of highlight processing combining U2-Net and LaMa networks.
Figure 10. RSU-L.
Figure 11. The fast Fourier convolution model.
Figure 12. PR curve and comprehensive evaluation index curve of the U2-Net model on the self-made dataset: (a) PR curve; (b) comprehensive evaluation index curve.
Figure 13. Partial test results of our U2-Net on real surface highlight bottle images: (a) original input image; (b) ground truth; (c) result map.
Figure 14. Visual comparison of our proposed highlight removal method with other state-of-the-art methods (the first row is the input image, and the second row is the image processed by the respective algorithms): (a) Antonio C et al. test result image; (b) Guo et al. test result image; (c) Sun et al. test result image; (d) Fu et al. test result image; (e) our test result image.
13 pages, 10914 KiB  
Article
Face Image Completion Based on GAN Prior
by Xiaofeng Shao, Zhenping Qiang, Fei Dai, Libo He and Hong Lin
Electronics 2022, 11(13), 1997; https://doi.org/10.3390/electronics11131997 - 26 Jun 2022
Cited by 6 | Viewed by 2506
Abstract
Face images are often used in social and entertainment activities to exchange information. However, during the transmission of digital images, various factors may destroy or obscure the key elements of an image, which can hinder the understanding of its content. Therefore, face image completion has become an important research branch in the field of computer image processing. Compared with traditional image inpainting methods, deep-learning-based inpainting methods have significantly improved results on face images, but in the case of complex semantic information and large missing areas, the completion results are still blurred, and the boundary colors are inconsistent and do not match human visual perception. To solve this problem, this paper proposes a face completion method based on a GAN prior, which guides the network to complete face images by directly using the rich and diverse prior information in a pre-trained GAN. The network model has a coarse-to-fine structure: the damaged face images and the corresponding masks are first input to the coarse network to obtain coarse results, and then the coarse results are input to the fine network with multi-resolution skip connections. The fine network uses the prior information from the pre-trained GAN to guide the generation of face images, and finally an SN-PatchGAN discriminator evaluates the completion results. The experiments are performed on the CelebA-HQ dataset. Compared with the three latest completion methods, the qualitative and quantitative experimental analysis shows that our method offers a clear improvement in texture and fidelity.
(This article belongs to the Section Computer Science & Engineering)
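The SN-PatchGAN discriminator mentioned in the abstract scores overlapping patches with a fully convolutional network whose convolutions are spectrally normalized. The PyTorch sketch below shows the general shape of such a discriminator; the channel widths, kernel sizes, and the 4-channel input (RGB plus mask) are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of an SN-PatchGAN-style discriminator (patch scores, spectral norm).
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv(in_ch, out_ch):
    return nn.Sequential(
        spectral_norm(nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=2, padding=2)),
        nn.LeakyReLU(0.2, inplace=True),
    )

class SNPatchDiscriminator(nn.Module):
    def __init__(self, in_ch=4):                 # RGB image + 1-channel mask (assumed)
        super().__init__()
        self.body = nn.Sequential(
            sn_conv(in_ch, 64), sn_conv(64, 128), sn_conv(128, 256),
            sn_conv(256, 256), sn_conv(256, 256),
        )
        self.head = spectral_norm(nn.Conv2d(256, 1, kernel_size=5, padding=2))

    def forward(self, x):                         # x: (N, 4, H, W)
        return self.head(self.body(x))            # (N, 1, H/32, W/32) per-patch scores
```

Scoring patches rather than the whole image lets the adversarial signal focus on local texture around the hole, which is why this style of discriminator is common in free-form inpainting.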
Figures:
Figure 1. Completion results from coarse to fine.
Figure 2. Face image completion model based on GAN prior.
Figure 3. Qualitative comparison results under free-form masks.
Figure 4. Qualitative comparison results under the center rectangle mask.
Figure 5. The effect of dilation convolution on the completion result.
Figure 6. The effect of the multi-resolution encoder on completion results.
Figure 7. The effect of the pre-trained GAN on completion results.
Figure 8. The effect of the decoder on the completion result.
Figure 9. Comparison of SN-PatchGAN discriminator and global discriminator completion results.
18 pages, 43034 KiB  
Article
Research on High-Resolution Face Image Inpainting Method Based on StyleGAN
by Libo He, Zhenping Qiang, Xiaofeng Shao, Hong Lin, Meijiao Wang and Fei Dai
Electronics 2022, 11(10), 1620; https://doi.org/10.3390/electronics11101620 - 19 May 2022
Cited by 15 | Viewed by 5043
Abstract
In face image recognition and other related applications, incomplete facial imagery due to obscuring factors during acquisition represents an issue that requires solving. Aimed at tackling this issue, the research surrounding face image completion has become an important topic in the field of image processing. Face image completion methods require the capability of capturing the semantics of facial expression, and deep learning networks have been widely shown to bear this ability. However, for high-resolution face image completion, the network training is difficult to converge, thus rendering high-resolution face image completion a difficult problem. Based on the study of deep learning models for high-resolution face image generation, this paper proposes a high-resolution face inpainting method. First, our method extracts the latent vector of the face image to be repaired through ResNet, then inputs the latent vector to the pre-trained StyleGAN model to generate the face image. Next, it calculates the loss between the known part of the face image to be repaired and the corresponding part of the generated face imagery. Afterward, the latent vector is updated and a new face image is generated, iterating until the iteration limit is reached. Finally, the Poisson fusion method is employed to process the last generated face image and the face image to be repaired in order to eliminate the difference in boundary color information of the repaired image. Through comparison with two classical face completion methods from recent years on the CelebA-HQ dataset, we found that our method achieves better completion results for 256 × 256 resolution face images. For 1024 × 1024 resolution face image restoration, we have also conducted a large number of experiments, which demonstrate the effectiveness of our method. Our method can obtain a variety of repair results by editing the latent vector. In addition, our method can be successfully applied to face image editing, watermark removal and other applications without retraining the network for the different masks used in these applications.
(This article belongs to the Special Issue New Advances in Visual Computing and Virtual Reality)
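The abstract describes an iterative, optimization-based completion loop: a ResNet encoder provides an initial latent code, the pretrained StyleGAN generates a face from it, and the loss is computed only on the known region. The sketch below captures that loop under stated assumptions: resnet_encoder and stylegan are stand-ins for the pretrained models, a single L1 term stands in for the paper's combined VGG/L2/Log-Cosh/MS-SSIM/LPIPS losses, and the final Poisson fusion step is only indicated in a comment.

```python
# Hedged sketch: latent optimization against the known pixels of a damaged face.
import torch
import torch.nn.functional as F

def complete_face(damaged, mask, resnet_encoder, stylegan, steps=300, lr=0.01):
    """damaged: (1, 3, H, W) image; mask: (1, 1, H, W) with 1 = known pixels."""
    latent = resnet_encoder(damaged).detach().requires_grad_(True)  # initial latent guess
    opt = torch.optim.Adam([latent], lr=lr)
    for _ in range(steps):
        generated = stylegan(latent)
        # Compare only the known (unmasked) region of the face.
        loss = F.l1_loss(generated * mask, damaged * mask)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # The paper additionally applies Poisson fusion (e.g. cv2.seamlessClone)
    # to hide colour seams between the generated and known regions.
    return stylegan(latent).detach()
```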
Show Figures

Figure 1

Figure 1
<p>Examples of the repair results of our method.</p>
Full article ">Figure 2
<p>Network structure diagram of our method.</p>
Full article ">Figure 3
<p>Comparison of completion results with or without ResNet prediction.</p>
Full article ">Figure 4
<p>Comparison of completion results with different loss functions. (<b>a</b>) Masked images, (<b>b</b>–<b>e</b>) Inpainted results using <math display="inline"><semantics> <msub> <mi>L</mi> <mn>2</mn> </msub> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>g</mi> <mtext>-</mtext> <mi>C</mi> <mi>o</mi> <mi>s</mi> <mi>h</mi> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>S</mi> <mi>S</mi> <mi>I</mi> <mi>M</mi> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>M</mi> <mi>S</mi> <mtext>-</mtext> <mi>S</mi> <mi>S</mi> <mi>I</mi> <mi>M</mi> </mrow> </semantics></math> losses, respectively. (<b>f</b>) Original images.</p>
Full article ">Figure 5
<p>Comparison of completion results with the combination of different types of loss functions. (<b>a</b>) Original images, (<b>b</b>) Masked images, (<b>c</b>) Generated images by using <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>V</mi> <mi>G</mi> <mi>G</mi> </mrow> </msub> </mrow> </semantics></math> loss, (<b>d</b>) Generated images by using <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>V</mi> <mi>G</mi> <mi>G</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <msub> <mi>L</mi> <mn>2</mn> </msub> </msub> </mrow> </semantics></math> losses, (<b>e</b>) Generated images by using <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>V</mi> <mi>G</mi> <mi>G</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>L</mi> <mi>o</mi> <mi>g</mi> <mtext>-</mtext> <mi>C</mi> <mi>o</mi> <mi>s</mi> <mi>h</mi> </mrow> </msub> </mrow> </semantics></math> losses, (<b>f</b>) Generated images by using <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>V</mi> <mi>G</mi> <mi>G</mi> </mrow> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>L</mi> <mi>o</mi> <mi>g</mi> <mtext>-</mtext> <mi>C</mi> <mi>o</mi> <mi>s</mi> <mi>h</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>M</mi> <mi>S</mi> <mo>-</mo> <mi>S</mi> <mi>S</mi> <mi>I</mi> <mi>M</mi> </mrow> </msub> </mrow> </semantics></math> losses, (<b>g</b>) Generated images by using <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>V</mi> <mi>G</mi> <mi>G</mi> </mrow> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>L</mi> <mi>o</mi> <mi>g</mi> <mtext>-</mtext> <mi>C</mi> <mi>o</mi> <mi>s</mi> <mi>h</mi> </mrow> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <msub> <mi>s</mi> <mrow> <mi>M</mi> <mi>S</mi> <mtext>-</mtext> <mi>S</mi> <mi>S</mi> <mi>I</mi> <mi>M</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <mi>L</mi> <mi>o</mi> <mi>s</mi> <mi>s</mi> </mrow> </semantics></math>-<math display="inline"><semantics> <mrow> <mi>L</mi> <mi>P</mi> <mi>I</mi> <mi>P</mi> <mi>S</mi> </mrow> </semantics></math> losses, (<b>h</b>) are the inpainted images obtained by Poisson fusion on the basis of (<b>f</b>).</p>
Full article ">Figure 6
<p>Intermediate images generated by our method during the completion process.</p>
Full article ">Figure 7
<p>Completion results on the center rectangle mask. (<b>a</b>) Masked images. (<b>b</b>) GLCIC results [<a href="#B8-electronics-11-01620" class="html-bibr">8</a>]. (<b>c</b>) CE results [<a href="#B5-electronics-11-01620" class="html-bibr">5</a>]. (<b>d</b>) Our results. (<b>e</b>) Original images.</p>
Full article ">Figure 8
<p>Completion results on the large-area rectangular masks. (<b>a</b>) Masked images. (<b>b</b>) GLCIC results [<a href="#B8-electronics-11-01620" class="html-bibr">8</a>]. (<b>c</b>) Our results. (<b>d</b>) Original images. (<b>e</b>) Masked images. (<b>f</b>) GLCIC results [<a href="#B8-electronics-11-01620" class="html-bibr">8</a>]. (<b>g</b>) Our results. (<b>h</b>) Original images.</p>
Full article ">Figure 9
<p>Completion results on the large-area irregular masks. (<b>a</b>) Masked images. (<b>b</b>) GLCIC results [<a href="#B8-electronics-11-01620" class="html-bibr">8</a>]. (<b>c</b>) Our results. (<b>d</b>) Original images.</p>
Full article ">Figure 10
<p>Inpainting results for masked images with different proportions of noise. The first to fourth rows correspond to 20%, 30%, 40% and 50% noise masks, respectively.</p>
Full article ">Figure 11
<p>Inpainting results for masked images with different proportions of free-form brush masks. The first to fourth rows correspond to 10–20%, 20–30%, 30–40% and 40–50% random free-form brush masks, respectively.</p>
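For readers unfamiliar with random free-form brush masks, the sketch below draws a few random strokes with OpenCV in the spirit of common inpainting setups; the stroke counts, lengths and widths are assumptions chosen for illustration, not the parameters of the mask generator evaluated above.

```python
# Sketch of a random free-form brush mask generator (the stroke parameters
# are illustrative assumptions, not those used in the experiments above).
import cv2
import numpy as np

def random_brush_mask(h=256, w=256, strokes=4, max_vertices=8,
                      max_length=60, max_width=20, seed=None):
    rng = np.random.default_rng(seed)
    mask = np.zeros((h, w), dtype=np.uint8)
    for _ in range(strokes):
        x, y = int(rng.integers(0, w)), int(rng.integers(0, h))
        width = int(rng.integers(5, max_width))
        for _ in range(int(rng.integers(2, max_vertices))):
            angle = rng.uniform(0, 2 * np.pi)
            length = int(rng.integers(10, max_length))
            nx = int(np.clip(x + length * np.cos(angle), 0, w - 1))
            ny = int(np.clip(y + length * np.sin(angle), 0, h - 1))
            cv2.line(mask, (x, y), (nx, ny), 255, width)  # 255 = hole pixels
            x, y = nx, ny
    return mask

mask = random_brush_mask(seed=0)
print("masked area: %.1f%%" % (100.0 * (mask > 0).mean()))
```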
Full article ">Figure 12
<p>Inpainting results for 90% random noise masked face images.</p>
Full article ">Figure 13
<p>The change in the total weighted loss during the iterative inpainting process.</p>
Full article ">Figure 14
<p>Average completion time over 1000 face images with random noise masks and random free-form brush masks.</p>
Full article ">Figure 15
<p>Different application experiments on face image inpainting. The experiments include watermark removal, text removal, and image editing.</p>
Full article ">Figure 16
<p>An example of a failed face image inpainting.</p>
Full article ">Figure 17
<p>Examples of diverse inpainting results.</p>
Full article ">
14 pages, 25324 KiB  
Article
Convincing 3D Face Reconstruction from a Single Color Image under Occluded Scenes
by Dapeng Zhao, Jinkang Cai and Yue Qi
Electronics 2022, 11(4), 543; https://doi.org/10.3390/electronics11040543 - 11 Feb 2022
Cited by 3 | Viewed by 3480
Abstract
The last few years have witnessed the great success of generative adversarial networks (GANs) in synthesizing high-quality photorealistic face images. Many recent 3D facial texture reconstruction works often pursue higher resolutions and ignore occlusion. We study the problem of detailed 3D facial reconstruction [...] Read more.
The last few years have witnessed the great success of generative adversarial networks (GANs) in synthesizing high-quality photorealistic face images. Many recent 3D facial texture reconstruction works often pursue higher resolutions and ignore occlusion. We study the problem of detailed 3D facial reconstruction under occluded scenes. This is a challenging problem, since collecting a large-scale, high-resolution 3D face dataset is still very costly. In this work, we propose a deep learning-based approach for detailed 3D face reconstruction that does not require large-scale 3D datasets. Motivated by generative face image inpainting and weakly supervised 3D deep reconstruction, we propose a complete 3D face model generation method guided by the contour. In our work, the weakly supervised 3D reconstruction framework generates convincing 3D models. We further test our method on the MICC Florence and LFW datasets, showing its strong generalization capacity and superior performance. Full article
(This article belongs to the Special Issue New Advances in Visual Computing and Virtual Reality)
Show Figures

Figure 1

Figure 1
<p>Method overview. See related sections for details.</p>
Full article ">Figure 2
<p>Our face mask generation module. It differs slightly from the traditional face parsing task, which segments the face into different components (usually eyebrows, eyes, nose, mouth, facial skin and so on) and outputs a face parsing map in which each component is represented by a different gray value. Our mask generation task only recognizes the occluded area, and the corresponding face mask map is a binary map.</p>
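To make the distinction concrete, a multi-class parsing-style label map can be reduced to the binary occlusion mask described above with a single comparison; the label id used for "occluded" below is an assumed convention for illustration, not the module's actual output format.

```python
# Illustrative reduction of a label map to a binary occlusion mask
# (the occlusion label id and array sizes are assumptions).
import numpy as np

OCCLUSION_LABEL = 255                              # assumed id for occluded pixels

label_map = np.zeros((64, 64), dtype=np.uint8)     # toy parsing-style map
label_map[20:40, 20:40] = OCCLUSION_LABEL          # pretend occluded patch

binary_mask = (label_map == OCCLUSION_LABEL).astype(np.uint8)
print(int(binary_mask.sum()), "occluded pixels")
```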
Full article ">Figure 3
<p>Comparison of qualitative results. Methods from left to right: 3DDFA, PRNet, DF<sup>2</sup>Net, Chen et al. and our method. A blank area means that the corresponding method fails on that input.</p>
Full article ">Figure 4
<p>Comparison of error heat maps for 3D shape recovery on the MICC Florence dataset. The numbers denote the 90% error (mm).</p>
Full article ">Figure 5
<p>Basic shape reconstructions with natural occlusions. (<b>Left</b>): Qualitative results of Sela et al. [<a href="#B95-electronics-11-00543" class="html-bibr">95</a>], and our shape. (<b>Right</b>): LFW verification ROC for the shapes, with and without occlusions.</p>
Full article ">
16 pages, 4335 KiB  
Article
Inpainted Image Reconstruction Using an Extended Hopfield Neural Network Based Machine Learning System
by Wieslaw Citko and Wieslaw Sienko
Sensors 2022, 22(3), 813; https://doi.org/10.3390/s22030813 - 21 Jan 2022
Cited by 10 | Viewed by 2400
Abstract
This paper considers the use of a machine learning system for the reconstruction and recognition of distorted or damaged patterns, in particular, images of faces partially covered with masks. The most up-to-date image reconstruction structures are based on constrained optimization algorithms and suitable [...] Read more.
This paper considers the use of a machine learning system for the reconstruction and recognition of distorted or damaged patterns, in particular, images of faces partially covered with masks. The most up-to-date image reconstruction structures are based on constrained optimization algorithms and suitable regularizers. In contrast with the above-mentioned image processing methods, the machine learning system presented in this paper employs the superposition of system vectors setting up asymptotic centers of attraction. The structure of the system is implemented using Hopfield-type neural network-based biorthogonal transformations. The reconstruction property gives rise to a superposition processor and reversible computations. Moreover, the distorted image reconstruction described in this paper sets up associative memories in which images stored in memory are retrieved by distorted or inpainted key images. Full article
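For orientation only, the attractor-based retrieval idea behind such systems can be illustrated with a textbook Hopfield associative memory; this sketch is a simplified stand-in and does not reproduce the biorthogonal-transformation structure described in the paper. The pattern size and corruption level are arbitrary assumptions.

```python
# Textbook Hopfield associative memory: a simplified illustration of
# attractor-based retrieval, not the paper's biorthogonal system.
import numpy as np

def train_hopfield(patterns: np.ndarray) -> np.ndarray:
    # patterns: (P, N) array of bipolar (+1/-1) vectors stored in memory.
    W = patterns.T @ patterns / patterns.shape[1]
    np.fill_diagonal(W, 0.0)            # no self-connections
    return W

def recall(W: np.ndarray, key: np.ndarray, steps: int = 20) -> np.ndarray:
    x = key.astype(float).copy()
    for _ in range(steps):              # synchronous updates for brevity
        x = np.sign(W @ x)
        x[x == 0] = 1.0
    return x

rng = np.random.default_rng(0)
stored = np.sign(rng.standard_normal((3, 100)))   # three stored "images"
W = train_hopfield(stored)

key = stored[0].copy()
key[:30] *= -1                                    # corrupt 30% of the entries
restored = recall(W, key)
print("overlap with the original:", float(restored @ stored[0]) / stored.shape[1])
```

Stored patterns act as asymptotic fixed points of the update rule, so a sufficiently close corrupted key converges back to the pattern it came from.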
(This article belongs to the Section Sensing and Imaging)
Show Figures

Figure 1

Figure 1
<p>Structure of the machine learning model for image processing.</p>
Full article ">Figure 2
<p>Block diagram of the approximator with lumped memory.</p>
Full article ">Figure 3
<p>Face images stored in memory (source: <a href="https://pixabay.com/pl" target="_blank">https://pixabay.com/pl</a>, accessed on 17 February 2021).</p>
Full article ">Figure 4
<p>Reconstruction of face images of people wearing masks.</p>
Full article ">Figure 5
<p>Attempt to recognize a photo that was not stored in memory.</p>
Full article ">Figure 6
<p>Masked image of the face in Photo Number 9 (<a href="#sensors-22-00813-f003" class="html-fig">Figure 3</a>).</p>
Full article ">Figure 7
<p>Structure of the reconstruction system when a fragment of the image (k lines) is kept as the input.</p>
Full article ">Figure 8
<p>Image reconstruction process in <a href="#sensors-22-00813-f006" class="html-fig">Figure 6</a> (after 1, 2, 5, 10, and 100 iterations).</p>
Full article ">Figure 9
<p>Image reconstruction of Lena’s photo (reconstruction system in <a href="#sensors-22-00813-f007" class="html-fig">Figure 7</a>).</p>
Full article ">Figure 10
<p>Reconstruction of distorted images (Items 10 and 14 in <a href="#sensors-22-00813-t004" class="html-table">Table 4</a>).</p>
Full article ">Figure 11
<p>Plots of MSE as a function of S/N.</p>
Full article ">Figure 12
<p>Original image and its transformation (projection).</p>
Full article ">Figure 13
<p>Structure of the system implementing the inverse transformation. (<b>a</b>) y<sub>i</sub>: undegenerated image projection; (<b>b</b>) ỹ<sub>i</sub>: degenerated image projection.</p>
Full article ">Figure 14
<p>An exemplary reconstruction, where F(·) denotes the system from <a href="#sensors-22-00813-f013" class="html-fig">Figure 13</a>b.</p>
Full article ">Figure 15
<p>Multilayer learning structure (K—number of steps; e.g., K = 100).</p>
Full article ">Figure 16
<p>Multilayer learning structure (L—number of steps; e.g., L = 100).</p>
Full article ">Figure 17
<p>Illustration of global attractor properties.</p>
Full article ">Figure 18
<p>Complex-valued image reconstruction: z<sub>43</sub> = x<sub>4</sub> + jx<sub>3</sub>, j<sup>2</sup> = −1, where x<sub>3</sub> and x<sub>4</sub> are the vectorized forms of images No. 3 and No. 4 in <a href="#sensors-22-00813-f003" class="html-fig">Figure 3</a>, and x<sub>3</sub><sup>(s)</sup>, x<sub>4</sub><sup>(s)</sup> are the corresponding distorted images.</p>
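As a worked illustration of the packing used in this caption, two real-valued image vectors can be combined into a single complex-valued vector and recovered exactly from its real and imaginary parts; the vector length below is an arbitrary assumption.

```python
# Packing two real image vectors into one complex vector, mirroring the
# caption's z43 = x4 + j*x3 (toy data; shapes are assumptions).
import numpy as np

x3 = np.random.rand(64 * 64)   # stand-in for the vectorized image No. 3
x4 = np.random.rand(64 * 64)   # stand-in for the vectorized image No. 4

z43 = x4 + 1j * x3             # complex-valued packed representation
assert np.allclose(z43.real, x4) and np.allclose(z43.imag, x3)
```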
Full article ">
28 pages, 34010 KiB  
Article
Hair Removal Combining Saliency, Shape and Color
by Giuliana Ramella
Appl. Sci. 2021, 11(1), 447; https://doi.org/10.3390/app11010447 - 5 Jan 2021
Cited by 11 | Viewed by 3957
Abstract
In a computer-aided system for skin cancer diagnosis, hair removal is one of the main challenges to face before applying a process of automatic skin lesion segmentation and classification. In this paper, we propose a straightforward method to detect and remove hair from [...] Read more.
In a computer-aided system for skin cancer diagnosis, hair removal is one of the main challenges to face before applying a process of automatic skin lesion segmentation and classification. In this paper, we propose a straightforward method to detect and remove hair from dermoscopic images. First, candidate hair regions and the border/corner components located on the image frame are automatically detected. Then, the hair regions are determined using information regarding saliency, shape and image colors. Finally, the detected hair regions are restored by a simple inpainting method. The method is evaluated on a publicly available dataset of 340 images extracted from two commonly used public databases, and on a specific dataset of 13 images already used by other authors for evaluation and comparison purposes. We also propose a method for the qualitative and quantitative evaluation of hair removal methods. The evaluation results are promising: the detection of hair regions is accurate, and the performance is satisfactory in comparison with other existing hair removal methods. Full article
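For context on the detect-then-inpaint pipeline summarized above, a widely used morphological baseline (black-hat filtering followed by inpainting, in the spirit of DullRazor-style approaches) can be sketched in a few lines. This baseline is shown only as a point of reference; it is not the proposed HR-SSC method, and the kernel size and threshold below are illustrative assumptions.

```python
# Generic black-hat + inpainting hair-removal baseline (for context only;
# this is NOT the HR-SSC method; parameters are illustrative assumptions).
import cv2
import numpy as np

def remove_hair_baseline(bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 17))
    # Black-hat filtering highlights thin dark structures such as hairs.
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)
    _, mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
    # Restore the detected hair pixels with fast-marching inpainting.
    return cv2.inpaint(bgr, mask, 3, cv2.INPAINT_TELEA)

# Usage (the file name is a placeholder):
# clean = remove_hair_baseline(cv2.imread("dermoscopic_image.png"))
```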
(This article belongs to the Special Issue Advanced Image Analysis and Processing for Biomedical Applications)
Show Figures

Figure 1

Figure 1
<p>(<b>a</b>) Examples of images (IMD003 and ISIC_0002871) with a massive presence of hair; (<b>b</b>) the resulting images after applying hair removal based on saliency, shape and color (HR-SSC).</p>
Full article ">Figure 2
<p>Flowchart of the proposed method (HR-SSC).</p>
Full article ">Figure 3
<p>Some examples of results obtained in the main steps of HR-SSC: (<b>a</b>) input image; (<b>b</b>) detected pseudo-hair components; (<b>c</b>) border/corner components; (<b>d</b>) detected hair; (<b>e</b>) resulting image.</p>
Full article ">Figure 4
<p>Image dataset <span class="html-italic">NH13-data</span> proposed in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>].</p>
Full article ">Figure 5
<p>Image dataset <span class="html-italic">H13GAN-data</span> generated by applying the GAN method [<a href="#B38-applsci-11-00447" class="html-bibr">38</a>] to <span class="html-italic">NH13-data</span> and published in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>].</p>
Full article ">Figure 6
<p>Image dataset <span class="html-italic">H13Sim-data</span> generated by applying the HairSim method [<a href="#B39-applsci-11-00447" class="html-bibr">39</a>] to <span class="html-italic">NH13-data</span> and published in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>].</p>
Full article ">Figure 7
<p>Image dataset <span class="html-italic">sNH-data</span> selected randomly from <span class="html-italic">NH-data</span>.</p>
Full article ">Figure 8
<p>Image dataset <span class="html-italic">sHSim-data</span> with the hair mask produced by applying the HairSim method to <span class="html-italic">sNH-data.</span></p>
Full article ">Figure 9
<p>Image dataset <span class="html-italic">sH-data</span> selected randomly from <span class="html-italic">H-data</span>.</p>
Full article ">Figure 10
<p>(<b>a</b>) Results of methods Lee, Xie, Abbas, Huang available in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>], rows 1–4, on <span class="html-italic">H13GAN-data.</span> (<b>b</b>) Results of methods Toossi, Bibiloni available in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>], rows 1–2, and results of HR-SSC, row 3, on <span class="html-italic">H13GAN-data.</span></p>
Figure 10 Cont.">
Full article ">Figure 11
<p>(<b>a</b>) Results of methods Lee, Xie, Abbas, Huang available in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>], rows 1–4, on <span class="html-italic">H13Sim-data.</span> (<b>b</b>) Results of methods Toossi, Bibiloni available in [<a href="#B37-applsci-11-00447" class="html-bibr">37</a>], rows 1–2, and results of HR-SSC, row 3, on <span class="html-italic">H13Sim-data.</span></p>
Figure 11 Cont.">
Full article ">Figure 12
<p>Results of methods Lee, Xie, and HR-SSC on <span class="html-italic">sHSim-data</span>.</p>
Full article ">Figure 13
<p>Resulting mask of the HairSim method and the resulting masks of the methods Lee, Xie, and HR-SSC on <span class="html-italic">sHSim-data</span>.</p>
Full article ">Figure 14
<p>Results of methods Lee, Xie, and HR-SSC on <span class="html-italic">sH-data</span>.</p>
Full article ">Figure 15
<p>Resulting masks of methods Lee, Xie, and HR-SSC on <span class="html-italic">sH-data</span>.</p>
Full article ">Figure 16
<p>Trends of quality measures on <span class="html-italic">H13Sim-data</span> for the methods Lee, Xie, and HR-SSC.</p>
Full article ">Figure 17
<p>Trends of quality measures on <span class="html-italic">sHSim-data</span> for the methods Lee, Xie, and HR-SSC.</p>
Full article ">