Search Results (21)

Search Parameters: Keywords = AIGC

25 pages, 1498 KiB  
Article
Fostering Continuous Innovation in Creative Education: A Multi-Path Configurational Analysis of Continuous Collaboration with AIGC in Chinese ACG Educational Contexts
by Juan Huangfu, Ruoyuan Li, Junping Xu and Younghwan Pan
Sustainability 2025, 17(1), 144; https://doi.org/10.3390/su17010144 - 27 Dec 2024
Viewed by 850
Abstract
AI-generated content (AIGC) is uniquely positioned to drive the digital transformation of professional education in the animation, comic, and game (ACG) industries. However, its collaborative application also faces initial novelty effects and user discontinuance. Existing studies often employ single-variable analytical methods, which struggle to capture the complex mechanisms influencing technology adoption. This study innovatively combines necessary condition analysis (NCA) and fuzzy-set qualitative comparative analysis (fsQCA) and applies them to the field of ACG education. Using this mixed-method approach, it systematically explores the necessary conditions and configurational effects influencing educational users’ continuance intention to adopt AIGC tools for collaborative design learning, aiming to address existing research gaps. A survey of 312 Chinese ACG educational users revealed that no single factor constitutes a necessary condition for their continuance intention to adopt AIGC tools. Additionally, five pathways leading to high adoption intention and three pathways leading to low adoption intention were identified. Notably, the absence or insufficiency of task–technology fit and perceived quality does not hinder ACG educational users’ willingness to actively adopt AIGC tools, reflecting the creativity-driven learning characteristics and the flexible and diverse tool demands of the ACG discipline. The findings provide theoretical and empirical insights to enhance the effective synergy and sustainable development between ACG education and AIGC tools. Full article
(This article belongs to the Special Issue Artificial Intelligence in Education and Sustainable Development)
Figures:
Figure 1: AIGC in ACG production.
Figure 2: Conceptual framework.
Figure 3: The process of NCA and fsQCA analysis.
33 pages, 3827 KiB  
Review
Distinguishing Reality from AI: Approaches for Detecting Synthetic Content
by David Ghiurău and Daniela Elena Popescu
Computers 2025, 14(1), 1; https://doi.org/10.3390/computers14010001 - 24 Dec 2024
Viewed by 660
Abstract
The advancement of artificial intelligence (AI) technologies, including generative pre-trained transformers (GPTs) and generative models for text, image, audio, and video creation, has revolutionized content generation, creating unprecedented opportunities and critical challenges. This paper systematically examines the characteristics, methodologies, and challenges associated with detecting synthetic content across multiple modalities, with the goal of safeguarding digital authenticity and integrity. Key detection approaches reviewed include stylometric analysis, watermarking, pixel prediction techniques, dual-stream networks, machine learning models, blockchain, and hybrid approaches, highlighting their strengths, limitations, and detection accuracy, which ranges from roughly 80% for stylometric analysis alone to up to 92% for hybrid approaches that combine multiple modalities. The effectiveness of these techniques is explored in diverse contexts, from identifying deepfakes and synthetic media to detecting AI-generated scientific texts. Ethical concerns, such as privacy violations, algorithmic bias, false positives, and overreliance on automated systems, are also critically discussed. Furthermore, the paper addresses legal and regulatory frameworks, including intellectual property challenges and emerging legislation, emphasizing the need for robust governance to mitigate misuse. Real-world examples of detection systems are analyzed to provide practical insights into implementation challenges. Future directions include developing generalizable and adaptive detection models, pursuing hybrid approaches, fostering collaboration between stakeholders, and integrating ethical safeguards. By presenting a comprehensive overview of AIGC detection, this paper aims to inform stakeholders, researchers, policymakers, and practitioners on addressing the dual-edged implications of AI-driven content creation. Full article
Figures:
Figure 1: Interest over time for artificial intelligence according to search engines [6].
Figure 2: PRISMA flow chart with the total number of studies and reports included.
Figure 3: Audio waveform analyzed for specific mismatches in tonality, rhythm, and fluency.
Figure 4: DALL-E generated image of an apple.
Figure 5: Analyzed AIGC image using Sight Engine.
Figure 6: Video frame extraction for authenticity analysis.
Figure 7: Accuracy of identifying generated and manipulated content [30,31].
Figure 8: The process of watermarking an image [35].
Figure 9: Regularization technique flow with specific components.
Figure 10: News submission on a blockchain ledger with crowdsourcing consensus.
23 pages, 35878 KiB  
Article
A Novel Face Swapping Detection Scheme Using the Pseudo Zernike Transform Based Robust Watermarking
by Zhimao Lai, Zhuangxi Yao, Guanyu Lai, Chuntao Wang and Renhai Feng
Electronics 2024, 13(24), 4955; https://doi.org/10.3390/electronics13244955 - 16 Dec 2024
Viewed by 504
Abstract
The rapid advancement of Artificial Intelligence Generated Content (AIGC) has significantly accelerated the evolution of Deepfake technology, thereby introducing escalating social risks due to its potential misuse. In response to these adverse effects, researchers have developed defensive measures, including passive detection and proactive forensics. Although passive detection has achieved some success in identifying Deepfakes, it encounters challenges such as poor generalization and decreased accuracy, particularly when confronted with anti-forensic techniques and adversarial noise. As a result, proactive forensics, which offers a more resilient defense mechanism, has garnered considerable scholarly interest. However, existing proactive forensic methodologies often fall short in terms of visual quality, detection accuracy, and robustness. To address these deficiencies, we propose a novel proactive forensic approach that utilizes pseudo-Zernike moment robust watermarking. This method is specifically designed to enhance the detection and analysis of face swapping by transforming facial data into a binary bit stream and embedding this information within the non-facial regions of video frames. Our approach facilitates the detection of Deepfakes while preserving the visual integrity of the video content. Comprehensive experimental evaluations have demonstrated the robustness of this method against standard signal processing operations and its superior performance in detecting Deepfake manipulations. Full article
(This article belongs to the Special Issue Network Security Management in Heterogeneous Networks)
Figures:
Figure 1: The fundamental principle of face swapping.
Figure 2: Results of two face-swapping algorithms and difference maps between ground-truth target images and face-swapping images.
Figure 3: Comparison of existing watermark-based active detection methods with the proposed method.
Figure 4: The framework of our method.
Figure 5: The process of generating robust watermarked images.
Figure 6: The determination of the watermark embedding region. (a) The facial region is divided into 5 × 5 non-overlapping blocks. (b) The black areas are not used for watermark embedding.
Figure 7: The process of generating robust watermarked images.
Figure 8: The process of detection for face swapping.
Figure 9: ACC of different thresholds under various types of conventional signal attacks.
Figure 10: The visual effects of the watermarked image and the Deepfake image. (a) Source image: PSNR = Inf, SSIM = 1.0; (b) watermarked image: PSNR = 38.76, SSIM = 0.9710; (c) DeepFaceLab image: PSNR = 33.46, SSIM = 0.9680.
20 pages, 25584 KiB  
Article
LIDeepDet: Deepfake Detection via Image Decomposition and Advanced Lighting Information Analysis
by Zhimao Lai, Jicheng Li, Chuntao Wang, Jianhua Wu and Donghua Jiang
Electronics 2024, 13(22), 4466; https://doi.org/10.3390/electronics13224466 - 14 Nov 2024
Viewed by 813
Abstract
The proliferation of AI-generated content (AIGC) has empowered non-experts to create highly realistic Deepfake images and videos using user-friendly software, posing significant challenges to the legal system, particularly in criminal investigations, court proceedings, and accident analyses. The absence of reliable Deepfake verification methods threatens the integrity of legal processes. In response, researchers have explored deep forgery detection, proposing various forensic techniques. However, the swift evolution of deep forgery creation and the limited generalizability of current detection methods impede practical application. We introduce a new deep forgery detection method that utilizes image decomposition and lighting inconsistency. By exploiting inherent discrepancies in imaging environments between genuine and fabricated images, this method extracts robust lighting cues and mitigates disturbances from environmental factors, revealing deeper-level alterations. A crucial element is the lighting information feature extractor, designed according to color constancy principles, to identify inconsistencies in lighting conditions. To address lighting variations, we employ a face material feature extractor using Pattern of Local Gravitational Force (PLGF), which selectively processes image patterns with defined convolutional masks to isolate and focus on reflectance coefficients, rich in textural details essential for forgery detection. Utilizing the Lambertian lighting model, we generate lighting direction vectors across frames to provide temporal context for detection. This framework processes RGB images, face reflectance maps, lighting features, and lighting direction vectors as multi-channel inputs, applying a cross-attention mechanism at the feature level to enhance detection accuracy and adaptability. Experimental results show that our proposed method performs exceptionally well and is widely applicable across multiple datasets, underscoring its importance in advancing deep forgery detection. Full article
(This article belongs to the Special Issue Deep Learning Approach for Secure and Trustworthy Biometric System)
Figures:
Figure 1: Imaging process of a digital image.
Figure 2: Process of image generation using generative adversarial networks.
Figure 3: Architecture of the proposed method.
Figure 4: Illustration of artifacts in deep learning-generated faces; the right-most image shows over-rendering around the nose area.
Figure 5: Illustration of inconsistent iris colors in generated faces.
Figure 6: Visualization of illumination maps for real images and four forgery methods from the FF++ database.
Figure 7: Face material map after illumination normalization; abnormal traces in the eye and mouth regions are more noticeable.
Figure 8: Visualization of face material maps for the facial regions in real images and four forgery methods from the FF++ database for the same frame.
Figure 9: Three-dimensional lighting direction vector.
Figure 10: Two-dimensional lighting direction vector.
Figure 11: Calculation process of lighting direction.
Figure 12: Calculation of the angle of lighting direction.
Figure 13: Comparison of lighting direction angles between real videos and their corresponding Deepfake videos.
23 pages, 4363 KiB  
Article
Human Adaption to Climate Change: Marine Disaster Risk Reduction in the Era of Intelligence
by Junyao Luo and Aihua Yang
Sustainability 2024, 16(22), 9647; https://doi.org/10.3390/su16229647 - 5 Nov 2024
Viewed by 932
Abstract
With the intensification of global warming and sea level rise, extreme weather and climate events occur frequently, increasing the probability and destructive power of marine disasters. The purpose of this paper is to propose specific applications of artificial intelligence (AI) in marine disaster risk reduction. First, this paper uses computer vision to assess the vulnerability of the target and then uses CNN-LSTM to forecast tropical cyclones. Second, this paper proposes a social media communication mechanism based on deep learning and a psychological crisis intervention mechanism based on AIGC; a rescue response system based on an intelligent unmanned platform is also a focus of this research. Third, this paper discusses disaster loss assessment and reconstruction based on machine learning and smart city concepts. After proposing these specific application measures, this paper offers three policy recommendations: improving legislation to break the technological trap of AI; promoting scientific and technological innovation to break through key AI technologies; and strengthening coordination and cooperation to build a disaster reduction system that integrates man and machine. Overall, the paper aims to reduce the risk of marine disasters by applying AI and to provide scientific references for sustainability and human adaptation to climate change. Full article
Figures:
Figure 1: Structure of the SSD model.
Figure 2: Internal structure of the RBM model.
Figure 3: Internal structure of the DBN model.
Figure 4: Internal structure of the LSTM neural network ("×" denotes pointwise multiplication; "+" denotes pointwise addition).
Figure 5: Assessment time (a) and accuracy (b) of target vulnerability by experts only, AI only, and AI coupled with experts (gray lines: means; dark gray lines: medians; circles: outliers).
Figure 6: Frequency of tropical cyclones landing in China from 1949 to 2022 (a) and from April to December (b); colored lines show the values of the corresponding bars.
Figure 7: Tropical cyclone intensity prediction based on the CNN-LSTM method (gray lines: means; dark gray lines: medians; circles: outliers).
Figure 8: Prediction of effective wave height based on the SVM method: observations (a), predictions (b), and their correlation (c), where the black line marks 100% correlation and the red line the actual correlation.
Figure 9: Wave direction prediction based on the SVM method: observations (a) and predictions (b).
Figure 10: The structure of the social media communication mechanism based on AI.
Figure 11: The structure of the rescue response system based on the intelligent unmanned platform.
Figure 12: The structure of the marine disaster loss assessment.
18 pages, 8743 KiB  
Article
An Improved YOLOv8-Based Foreign Detection Algorithm for Transmission Lines
by Pingting Duan and Xiao Liang
Sensors 2024, 24(19), 6468; https://doi.org/10.3390/s24196468 - 7 Oct 2024
Viewed by 1315
Abstract
This research aims to overcome three major challenges in foreign object detection on power transmission lines: data scarcity, background noise, and high computational costs. In the improved YOLOv8 algorithm, the newly introduced lightweight GSCDown (Ghost Shuffle Channel Downsampling) module effectively captures subtle image features by combining 1 × 1 convolution and GSConv technology, thereby enhancing detection accuracy. CSPBlock (Cross-Stage Partial Block) fusion enhances the model’s accuracy and stability by strengthening feature expression and spatial perception while maintaining the algorithm’s lightweight nature and effectively mitigating the issue of vanishing gradients, making it suitable for efficient foreign object detection in complex power line environments. Additionally, PAM (pooling attention mechanism) effectively distinguishes between background and target without adding extra parameters, maintaining high accuracy even in the presence of background noise. Furthermore, AIGC (AI-generated content) technology is leveraged to produce high-quality images for training data augmentation, and lossless feature distillation ensures higher detection accuracy and reduces false positives. In conclusion, the improved architecture reduces the parameter count by 18% while improving the mAP@0.5 metric by a margin of 5.5 points when compared to YOLOv8n. Compared to state-of-the-art real-time object detection frameworks, our research demonstrates significant advantages in both model accuracy and parameter size. Full article
(This article belongs to the Special Issue Communications and Networking Based on Artificial Intelligence)
Figures:
Figure 1: The network structure of GCP-YOLO, an improved YOLOv8 algorithm incorporating GSCDown, CSPBlock, and PAM.
Figure 2: The structure of GSCDown.
Figure 3: The structure of CSPBlock.
Figure 4: The 20 × 20 × 1024 feature map from prediction layer 1 is condensed to 20 × 20 × 1 via average pooling, while the 80 × 80 × 256 feature map from prediction layer 3 is compressed to 1 × 1 × 256 through max pooling; the resulting maps are expanded in depth and size, passed through a sigmoid, and multiplied element-wise with the original feature map.
Figure 5: Schematic diagram of knowledge distillation.
Figure 6: The Photoshop "Generate" image process.
Figure 7: Examples of objects generated by AIGC: (a–c) original images; (d–f) the generated kite, balloon, and plastic bag based on (a–c), respectively.
Figure 8: (a) Original image; (b) heat map of YOLOv8n; (c) heat map of our model.
26 pages, 3818 KiB  
Article
Human–AI Co-Drawing: Studying Creative Efficacy and Eye Tracking in Observation and Cooperation
by Yuying Pei, Linlin Wang and Chengqi Xue
Appl. Sci. 2024, 14(18), 8203; https://doi.org/10.3390/app14188203 - 12 Sep 2024
Viewed by 1885
Abstract
Artificial intelligence (AI) tools are rapidly transforming the field of traditional artistic creation, influencing painting processes and human creativity. This study explores human–AI cooperation in real-time artistic drawing by using the AIGC tool KREA.AI. Participants wear eye trackers and perform drawing tasks by adjusting the AI parameters. The research aims to investigate the impact of cross-screen and non-cross-screen conditions, as well as different viewing strategies, on cognitive load and the degree of creative stimulation during user–AI collaborative drawing. Adopting a mixed design, it examines the influence of different cooperation modes and visual search methods on creative efficacy and visual perception through eye-tracking data and creativity performance scales. The cross-screen type and task type have a significant impact on total interval duration, number of fixation points, average fixation duration, and average pupil diameter in occlusion decision-making and occlusion hand drawing. There are significant differences in the variables of average gaze duration and average pupil diameter among different task types and cross-screen types. In non-cross-screen situations, occlusion and non-occlusion have a significant impact on average gaze duration and pupil diameter. Tasks in non-cross-screen environments are more sensitive to visual processing. The involvement of AI in hand drawing in non-cross-screen collaborative drawing by designers has a significant impact on their visual perception. These results help us to gain a deeper understanding of user behaviour and cognitive load under different visual tasks and cross-screen conditions. The analysis of the creative efficiency scale data reveals significant differences in designers’ ability to supplement and improve AI ideas across different modes. This indicates that the extent of AI participation in the designer’s hand-drawn creative process significantly impacts the designer’s behaviour when negotiating design ideas with the AI. Full article
Figures:
Figure 1: KREA.AI user interface (https://www.krea.ai, accessed on 6 March 2024).
Figure 2: Human–AI collaborative mapping mode level map.
Figure 3: Design process: human–computer function allocation and cooperation model (self-drawn by the authors).
Figure 4: The flow of experimental tasks.
Figure 5: Scene diagram of experimental recording of field subjects.
Figure 6: Histograms of task time and eye movement data (task time, number of gaze points, gaze duration, and mean pupil size).
Figure 7: Boxplots of the number of whole fixations for the four task types in non-cross-screen situations; Modes 1 and 3 ("competing or working separately") show higher fixation counts than Modes 2 and 4 ("supplementing each other").
Figure 8: Boxplots of the average duration of whole fixations for the four task types in non-cross-screen situations; Modes 1 and 3 cluster in a narrow range, whereas Modes 2 and 4 are higher and more dispersed.
Figure 9: Boxplots of the average whole-fixation pupil diameter for the four task types in non-cross-screen situations; medians are higher for Modes 2 and 4 than for Modes 1 and 3.
Figure 10: Boxplots of the number of saccades for the four task types in non-cross-screen situations; medians are lower for Modes 2 and 4 than for Modes 1 and 3.
Figure 11: Hotspot maps of the four modes.
26 pages, 13334 KiB  
Article
Generating 3D Models for UAV-Based Detection of Riparian PET Plastic Bottle Waste: Integrating Local Social Media and InstantMesh
by Shijun Pan, Keisuke Yoshida, Daichi Shimoe, Takashi Kojima and Satoshi Nishiyama
Drones 2024, 8(9), 471; https://doi.org/10.3390/drones8090471 - 9 Sep 2024
Viewed by 1164
Abstract
In recent years, waste pollution has become a severe threat to riparian environments worldwide. Along with the advancement of deep learning (DL) algorithms (i.e., object detection models), related techniques have become useful for practical applications. This work attempts to develop a data generation approach to generate datasets for small target recognition, especially for recognition in remote sensing images. A relevant point is that similarity between data used for model training and data used for testing is crucially important for object detection model performance. Therefore, obtaining training data with high similarity to the monitored objects is a key objective of this study. Currently, Artificial Intelligence Generated Content (AIGC), such as single target objects generated by Luma AI, is a promising data source for DL-based object detection models. However, most of the training data supporting the generated results are not from Japan. Consequently, the generated data are less similar to monitored objects in Japan, having, for example, different label colors, shapes, and designs. For this study, the authors developed a data generation approach by combining social media (Clean-Up Okayama) and single-image-based 3D model generation algorithms (e.g., InstantMesh) to provide a reliable reference for future generations of localized data. The trained YOLOv8 model in this research, obtained from the S2PS (Similar to Practical Situation) AIGC dataset, produced encouraging results (high F1 scores, approximately 0.9) in scenario-controlled UAV-based riparian PET bottle waste identification tasks. The results of this study show the potential of AIGC to supplement or replace real-world data collection and reduce the on-site work load. Full article
Figures:
Figure 1: Process of plastics’ entry into the food chain: initial river entry; breakdown of oceanic macroplastics into microplastic pollution by wind and sunlight; bio-magnification through the marine food chain into the human body.
Figure 2: On-site waste pollution detected in the Hyakken River, Japan.
Figure 3: Bottle-related models generated by Luma AI GENIE (website page in the upper panel); four models generated with different prompts (lower panel).
Figure 4: Samples of PET bottle waste derived from txt2img AIGC, on-site imagery, and Luma AI GENIE generations.
Figure 5: Clean-Up Okayama website (English content derived from image-based Google Translate), comprising: 1. total participants and waste picked up in Okayama prefecture; 2. waste counts for the whole period mapped across the prefecture via Google Maps; 3. comments and field images uploaded by users (user names and profile logos obscured); 4. chart of waste collection activities by date.
Figure 6: Sample pick-up images from section 3 of Figure 5.
Figure 7: Process of collecting local bottle-waste objects: 1. collecting on-site waste; 2. taking the image; 3. uploading the image to the website; 4. generating the 3D model.
Figure 8: Upper map: locations (icons) of waste-related image capture and upload; lower satellite image: the Hyakken River area (both derived from Google Maps).
Figure 9: InstantMesh model architecture.
Figure 10: Process of generating S2PS AIGC using InstantMesh: 1. inputting the image; 2. generating multiple views from a single input image; 3. outputting the GLB/OBJ-formatted S2PS AIGC.
Figure 11: Samples of the S2PS AIGC.
Figure 12: Process of generating automatically rotating bottle videos using the autoRotate function in glTF Viewer.
Figure 13: Frame images derived from the automatically rotating bottle video.
Figure 14: Process of generating specific datasets (one resource image): 1. selecting one frame image; 2. making the black background transparent; 3. selecting one drone image taken at a 75° camera angle and 2 cm GSD (ground sample distance); 4. adjusting the bottle size from step 2 to match the bottle size in the drone image; 5. compositing the bottle onto the drone image; 6. applying data augmentation, mainly image direction changes and blur.
Figure 15: Process of generating the S2PS AIGC dataset (multiple resource images): 1. pre-processing (selecting background and object images); 2. setting parameters (mainly adjusting relative image sizes); 3. generating multiple resource images.
Figure 16: Training results, including train/valid box_loss/cls_loss/dfl_loss, precision/recall, and mAP50/50–95 (one source image, epoch 1000, batch size 16, patience 50).
Figure 17: Samples of training batch images: train_batch 0, 1, and 2 (one source image, epoch 1000, batch size 16, patience 50).
Figure 18: Process of selecting the test image: left, using 50 drone images to reconstruct the 3D model; right, zooming in on an object of similar size and outputting the image (performed with the free version of the photogrammetry software 3DF Zephyr, which creates 3D models from photographs).
Figure 19: Relation between inference image size and corresponding confidence value (test 1).
Figure 20: Results of training the model (one source image, epoch 10,000, batch size 16, patience 1000).
Figure 21: Validation results (test 2): left, inference with confidence values; right, true labels.
Figure 22: F1 score derived from the validation results (one source image).
Figure 23: Object sizes: left, generated resource image; middle, a 3D-derived test with the same background; right, a similar object test with a similar background.
Figure 24: Results of training the model, including train/valid box_loss/cls_loss/dfl_loss, precision/recall, and mAP50/50–95 (one source image, epoch 1000, batch size 256, patience 1000).
Figure 25: Samples of training batch images: train_batch 0, 1, 2, 19800, 19801, and 19802.
Figure 26: Validation results: left, inference with confidence values; right, true labels.
Figure 27: F1 score derived from the validation results (multiple source images).
Figure 28: Samples of the 3D waste group generations.
Figure 29: Samples of failed 3D model generations.
21 pages, 1350 KiB  
Article
Exploring Consumer Acceptance of AI-Generated Advertisements: From the Perspectives of Perceived Eeriness and Perceived Intelligence
by Chenyan Gu, Shuyue Jia, Jiaying Lai, Ruli Chen and Xinsiyu Chang
J. Theor. Appl. Electron. Commer. Res. 2024, 19(3), 2218-2238; https://doi.org/10.3390/jtaer19030108 - 3 Sep 2024
Viewed by 6435
Abstract
The rapid popularity of ChatGPT has brought generative AI into broad focus. The content generation model represented by AI-generated content (AIGC) has reshaped the advertising industry. This study explores the mechanisms by which the characteristics of AI-generated advertisements affect consumers’ willingness to accept these advertisements from the perspectives of perceived eeriness and perceived intelligence. It found that the verisimilitude and imagination of AI-generated advertisements negatively affect the degree of perceived eeriness by consumers, while synthesis positively affects it. Conversely, verisimilitude, vitality, and imagination positively affect the perceived intelligence, while synthesis negatively affects it. Meanwhile, consumers’ perceived eeriness negatively affects their acceptance of AI-generated advertisements, while perceived intelligence positively affects their willingness to accept AI-generated advertisements. This study helps explain consumers’ attitudes toward AI-generated advertisements and offers strategies for brands and advertisers for how to use AI technology more scientifically to optimize advertisements. Advertisers should cautiously assess the possible impact of AI-generated advertisements according to their characteristics, allowing generative AI to play a more valuable role in advertising. Full article
(This article belongs to the Section Digital Marketing and the Connected Consumer)
Figures:
Figure 1: Research model and hypotheses.
Figure 2: Conclusion of the structural model.
17 pages, 24383 KiB  
Article
Can Stylized Products Generated by AI Better Attract User Attention? Using Eye-Tracking Technology for Research
by Yunjing Tang and Chen Chen
Appl. Sci. 2024, 14(17), 7729; https://doi.org/10.3390/app14177729 - 2 Sep 2024
Viewed by 1507
Abstract
The emergence of AIGC has significantly improved design efficiency, enriched creativity, and promoted innovation in the design industry. However, whether the content generated from its own database meets the preferences of target users still needs to be determined through further testing. This study investigates the appeal of AI-generated stylized products to users, utilizing 12 images as stimuli in conjunction with eye-tracking technology. The stimulus is composed of top-selling gender-based stylized Bluetooth earphones from the Taobao shopping platform and the gender-based stylized earphones generated by the AIGC software GPT4.0, categorized into three experimental groups. An eye-tracking experiment was conducted in which 44 participants (22 males and 22 females, mean age = 21.75, SD = 2.45, range 18–27 years) observed three stimuli groups. The eye movements of the participants were measured while viewing product images. The results indicated that variations in stimuli category and gender caused differences in fixation durations and counts. When presenting a mix of the two types of earphones, the AIGC-generated earphones and earphones from the Taobao shopping platform, the two gender groups both showed a significant effect in fixation duration with F (2, 284) = 3.942, p = 0.020 < 0.05, and η = 0.164 for the female group and F (2, 302) = 8.824, p < 0.001, and η = 0.235 for the male group. They all had a longer fixation duration for the AI-generated earphones. When presenting exclusively the two types of AI-generated gender-based stylized earphones, there was also a significant effect in fixation duration with F (2, 579) = 4.866, p = 0.008 < 0.05, and η = 0.129. The earphones generated for females had a longer fixation duration. Analyzing this dataset from a gender perspective, there was no significant effect when the male participants observed the earphones, with F (2, 304) = 1.312 and p = 0.271, but there was a significant difference in fixation duration when the female participants observed the earphones (F (2, 272) = 4.666, p = 0.010 < 0.05, and η = 0.182). The female participants had a longer fixation duration towards the earphones that the AI generated for females. Full article
Figures:
Figure 1: The process of searching for earphones on the Taobao shopping platform.
Figure 2: Six pairs of earphones obtained from the Taobao shopping platform.
Figure 3: The process of generating stylized earphones using the AIGC software GPT4.0.
Figure 4: Six pairs of earphones generated by the AIGC software GPT4.0.
Figure 5: Three stimuli groups.
Figure 6: Experimental instrument.
Figure 7: AOIs and numbers of each stimulus.
Figure 8: The process of studying whether AI-generated products are more attractive.
Figure 9: The heat map for Stimuli group 1.
Figure 10: Stimuli group 1: (a) the female participants’ fixation counts for each stimulus; (b) the female participants’ fixation duration for each stimulus; (c) the fixation duration for the two categories among the female participants.
Figure 11: The heat map for Stimuli group 2.
Figure 12: Stimuli group 2: (a) the male participants’ fixation counts for each stimulus; (b) the male participants’ fixation duration for each stimulus; (c) the fixation duration for the two categories among the male participants.
Figure 13: The heat maps for Stimuli group 3.
Figure 14: (a) The fixation duration for the two categories among all participants; (b) the fixation duration for the two categories among the female participants.
18 pages, 1620 KiB  
Article
A Study on Teachers’ Willingness to Use Generative AI Technology and Its Influencing Factors: Based on an Integrated Model
by Haili Lu, Lin He, Hao Yu, Tao Pan and Kefeng Fu
Sustainability 2024, 16(16), 7216; https://doi.org/10.3390/su16167216 - 22 Aug 2024
Viewed by 3386
Abstract
The development of new artificial intelligence-generated content (AIGC) technology creates new opportunities for the digital transformation of education. Teachers’ willingness to adopt AIGC technology for collaborative teaching is key to its successful implementation. This study employs the TAM and TPB to construct a model analyzing teachers’ acceptance of AIGC technology, focusing on the influencing factors and differences across various educational stages. The study finds that teachers’ behavioral intentions to use AIGC technology are primarily influenced by perceived usefulness, perceived ease of use, behavioral attitudes, and perceived behavioral control. Perceived ease of use affects teachers’ willingness both directly and indirectly across different groups. However, perceived behavioral control and behavioral attitudes only directly influence university teachers’ willingness to use AIGC technology, with the impact of behavioral attitudes being stronger than that of perceived behavioral control. The empirical findings of this study promote the rational use of AIGC technology by teachers, providing guidance for encouraging teachers to actively explore the use of information technology in building new forms of digital education. Full article
Figures:
Figure 1: Behavioral intention model of teachers’ AIGC technology use behavior based on the TAM–TPB framework.
Figure 2: Primary and middle school.
Figure 3: Upper secondary school.
Figure 4: University.
20 pages, 1720 KiB  
Article
Exploring Designer Trust in Artificial Intelligence-Generated Content: TAM/TPB Model Study
by Shao-Feng Wang and Chun-Ching Chen
Appl. Sci. 2024, 14(16), 6902; https://doi.org/10.3390/app14166902 - 7 Aug 2024
Cited by 3 | Viewed by 2975
Abstract
Traditionally, users have perceived that only manual laborers or those in repetitive jobs would be subject to technological substitution. However, with the emergence of technologies like Midjourney, ChatGPT, and Notion AI, known as Artificial Intelligence-Generated Content (AIGC), we have come to realize that cognitive laborers, particularly creative designers, also face similar professional challenges. Yet, there has been relatively little research analyzing the acceptance and trust of artificial intelligence from the perspective of designers. This study integrates the TAM/TPB behavioral measurement model, incorporating intrinsic characteristics of designers, to delineate their perceived risks of AIGC into functional and emotional dimensions. It explores how these perceived characteristics, risks, and trust influence designers’ behavioral intentions, employing structural equation modeling for validation. The findings reveal the following: (1) designer trust is the primary factor influencing their behavioral choices; (2) different dimensions of perceived risks have varying degrees of impact on trust, with functional risks significantly positively affecting trust compared to emotional risks; (3) only by enhancing the transparency and credibility of Artificial Intelligence-Generated Content (AIGC) can the perceived characteristics of designers be elevated; and (4) only by effectively safeguarding designers’ legitimate rights and interests can perceived risks be significantly reduced, thereby enhancing trust and subsequently prompting actual behavioral intentions. This study not only enhances the applicability and suitability of AIGC across various industries but also provides evidence for the feasibility of intelligent design in the creative design industry, facilitating the transition of AIGC to Artificial Intelligence-Generated Design (AIGD) for industrial upgrading. Full article
Figures:
Figure 1: Principles and application processes of design thinking.
Figure 2: The proposed conceptual model.
Figure 3: Results of model analysis.
Figure 4: AIGC design thinking and flowchart.
24 pages, 10810 KiB  
Article
Exploring the Dual Potential of Artificial Intelligence-Generated Content in the Esthetic Reproduction and Sustainable Innovative Design of Ming-Style Furniture
by Yali Wang, Yuchen Xi, Xinxiong Liu and Yan Gan
Sustainability 2024, 16(12), 5173; https://doi.org/10.3390/su16125173 - 18 Jun 2024
Cited by 3 | Viewed by 2386
Abstract
The present research aims to explore the dual potential of artificial intelligence-generated content (AIGC) technology in the esthetic reproduction of Ming-style furniture and its innovative design while promoting sustainable practices and cultural heritage preservation. For this purpose, a combination of methodologies integrating the principles of grounded theory, empirical research, sustainable design, and design practice and evaluation techniques is employed. The results are as follows: First, the three-level coding method in grounded theory is used to construct a multi-dimensional esthetic feature library of Ming-style furniture, including 6 esthetic feature dimensions and 102 groups of esthetic elements. Second, a set of databases specifically for Ming-style furniture is developed based on the Midjourney platform. The AIGC exclusive toolkit for furniture (MFMP) contains a language package of 61 keywords and a basic formula for Ming-style furniture design. The MFMP toolkit accurately reproduces Ming-style furniture esthetics through empirical validation. Finally, combined with sustainable design principles, a new path is explored in order to utilize the MFMP toolkit for the sustainable and innovative design of new Chinese-style furniture. The research results demonstrate that AIGC enhances traditional and modern furniture design, offering tools for industry growth in a sustainable way and preserving cultural heritage. Full article
(This article belongs to the Section Tourism, Culture, and Heritage)
Figures:
Figure 1: Basic research framework based on the double diamond model.
Figure 2: Flowchart of the grounded theory research process.
Figure 3: Constructing the Ming-style table furniture esthetics database.
Figure 4: Experimental testing process of the language package for esthetic characteristics of Ming-style furniture.
Figure 5: Ming-style table and case furniture AIGC esthetic characteristics language package.
Figure 6: AIGC esthetic characteristics language package for Ming-style table furniture.
Figure 7: Innovative design process of new Chinese-style furniture.
Figure 8: New Chinese-style "bamboo forest table" concept design plans A and B.
20 pages, 7046 KiB  
Article
Knowledge-Driven and Diffusion Model-Based Methods for Generating Historical Building Facades: A Case Study of Traditional Minnan Residences in China
by Sirui Xu, Jiaxin Zhang and Yunqin Li
Information 2024, 15(6), 344; https://doi.org/10.3390/info15060344 - 11 Jun 2024
Cited by 2 | Viewed by 1686
Abstract
The preservation of historical traditional architectural ensembles faces multifaceted challenges, and the need for facade renovation and updates has become increasingly prominent. In conventional architectural updating and renovation processes, assessing design schemes and the redesigning component are often time-consuming and labor-intensive. The knowledge-driven method utilizes a wide range of knowledge resources, such as historical documents, architectural drawings, and photographs, commonly used to guide and optimize the conservation, restoration, and management of architectural heritage. Recently, the emergence of artificial intelligence-generated content (AIGC) technologies has provided new solutions for creating architectural facades, introducing a new research paradigm to the renovation plans for historic districts with their variety of options and high efficiency. In this study, we propose a workflow combining Grasshopper with Stable Diffusion: starting with Grasshopper to generate concise line drawings, then using the ControlNet and low-rank adaptation (LoRA) models to produce images of traditional Minnan architectural facades, allowing designers to quickly preview and modify the facade designs during the renovation of traditional architectural clusters. Our research results demonstrate Stable Diffusion’s precise understanding and execution ability concerning architectural facade elements, capable of generating regional traditional architectural facades that meet architects’ requirements for style, size, and form based on existing images and prompt descriptions, revealing the immense potential for application in the renovation of traditional architectural groups and historic districts. It should be noted that the correlation between specific architectural images and proprietary term prompts still requires further addition due to the limitations of the database. Although the model generally performs well when trained on traditional Chinese ancient buildings, the accuracy and clarity of more complex decorative parts still need enhancement, necessitating further exploration of solutions for handling facade details in the future. Full article
(This article belongs to the Special Issue AI Applications in Construction and Infrastructure)
Figures:
Figure 1: Each part of a Minnan residence.
Figure 2: Methodology workflow.
Figure 3: Image collection.
Figure 4: Image tags.
Figure 5: Structure of elements of the facade of Minnan residences.
Figure 6: The flow diagram of the architecture facade generation.
Figure 7: Component sequence diagrams for different positions of the line draft (the component in the red frame is the output part).
Figure 8: ControlNet canny model processing: (a) the line draft generated by Grasshopper; (b) the line draft processed by the ControlNet canny model.
Figure 9: Training results of the two LoRA models.
Figure 10: Radar charts of the qualitative evaluation.
Figure 11: CLIP scores across pictures generated by different LoRAs.
11 pages, 13153 KiB  
Article
Image Steganography and Style Transformation Based on Generative Adversarial Network
by Li Li, Xinpeng Zhang, Kejiang Chen, Guorui Feng, Deyang Wu and Weiming Zhang
Mathematics 2024, 12(4), 615; https://doi.org/10.3390/math12040615 - 19 Feb 2024
Cited by 2 | Viewed by 2431
Abstract
Traditional image steganography conceals secret messages in unprocessed natural images by modifying the pixel value, causing the obtained stego to be different from the original image in terms of the statistical distribution; thereby, it can be detected by a well-trained classifier for steganalysis. To ensure the steganography is imperceptible and in line with the trend of art images produced by Artificial-Intelligence-Generated Content (AIGC) becoming popular on social networks, this paper proposes to embed hidden information throughout the process of the generation of an art-style image by designing an image-style-transformation neural network with a steganography function. The proposed scheme takes a content image, an art-style image, and messages to be embedded as inputs, processing them with an encoder–decoder model, and finally, generates a styled image containing the secret messages at the same time. An adversarial training technique was applied to enhance the imperceptibility of the generated art-style stego image from plain-style-transferred images. The lack of the original cover image makes it difficult for the opponent learning steganalyzer to identify the stego. The proposed approach can successfully withstand existing steganalysis techniques and attain the embedding capacity of three bits per pixel for a color image, according to the experimental results. Full article
(This article belongs to the Special Issue Representation Learning for Computer Vision and Pattern Recognition)
Figures:
Figure 1: Comparison with traditional image steganography and style transfer steganography.
Figure 2: Framework of hiding information in the style transform network.
Figure 3: Comparison of clean style-transferred images without steganography (columns (c,d)) and stego style-transferred images (columns (e,f)).
Figure 4: Residual of the style-transferred image with and without secret information: $C_{M_{c1}}$ and $C_{M_{c2}}$ denote the style-transferred images generated by the clean models $M_{c1}$ and $M_{c2}$; $S_{M_{s1}}$ and $S_{M_{s2}}$ denote the style-transferred images with secret messages generated by the steganography models $M_{s1}$ and $M_{s2}$.