Search Results (141)

Search Parameters:
Keywords = Gabor filtering

21 pages, 14388 KiB  
Article
Adaptive Matching of High-Frequency Infrared Sea Surface Images Using a Phase-Consistency Model
by Xiangyu Li, Jie Chen, Jianwei Li, Zhentao Yu and Yaxun Zhang
Sensors 2025, 25(5), 1607; https://doi.org/10.3390/s25051607 - 6 Mar 2025
Viewed by 193
Abstract
The sea surface displays dynamic characteristics, such as waves and various formations. As a result, images of the sea surface usually have few stable feature points, with a background that is often complex and variable. Moreover, the sea surface undergoes significant changes due to variations in wind speed, lighting conditions, weather, and other environmental factors, resulting in considerable discrepancies between images. These variations present challenges for identification using traditional methods. This paper introduces an algorithm based on the phase-consistency model. We utilize image data collected from a specific maritime area with a high-frame-rate surface array infrared camera. By accurately detecting images with identical names, we focus on the subtle texture information of the sea surface and its rotational invariance, enhancing the accuracy and robustness of the matching algorithm. We begin by constructing a nonlinear scale space using a nonlinear diffusion method. Maximum and minimum moments are generated using an odd symmetric Log–Gabor filter within the two-dimensional phase-consistency model. Next, we identify extremum points in the anisotropic weighted moment space. We use the phase-consistency feature values as image gradient features and develop feature descriptors based on the Log–Gabor filter that are insensitive to scale and rotation. Finally, we employ Euclidean distance as the similarity measure for initial matching, align the feature descriptors, and remove false matches using the fast sample consensus (FSC) algorithm. Our findings indicate that the proposed algorithm significantly improves upon traditional feature-matching methods in overall efficacy. Specifically, the average number of matching points for long-wave infrared images is 1147, while for mid-wave infrared images, it increases to 8241. Additionally, the root mean square error (RMSE) fluctuations for both image types remain stable, averaging 1.5. The proposed algorithm also enhances the rotation invariance of image matching, achieving satisfactory results even at significant rotation angles.
(This article belongs to the Section Remote Sensors)
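As a rough illustration of the frequency-domain Log-Gabor filtering this abstract relies on, the sketch below builds a single 2D Log-Gabor filter (log-Gaussian radial band-pass times a Gaussian angular window) and applies it to an image. The center frequency, bandwidth, and orientation values are illustrative defaults, not parameters taken from the paper.

```python
import numpy as np

def log_gabor_2d(rows, cols, f0=0.1, sigma_f=0.55, theta0=0.0, sigma_theta=np.pi / 8):
    """Frequency-domain 2D Log-Gabor filter: log-Gaussian radial band-pass
    multiplied by a Gaussian angular window centred on theta0."""
    fy = np.fft.fftshift(np.fft.fftfreq(rows))
    fx = np.fft.fftshift(np.fft.fftfreq(cols))
    FX, FY = np.meshgrid(fx, fy)
    radius = np.sqrt(FX**2 + FY**2)
    radius[rows // 2, cols // 2] = 1.0          # avoid log(0) at the DC term
    radial = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_f) ** 2))
    radial[rows // 2, cols // 2] = 0.0          # zero response at DC
    angle = np.arctan2(FY, FX)
    d_theta = np.arctan2(np.sin(angle - theta0), np.cos(angle - theta0))
    angular = np.exp(-(d_theta ** 2) / (2 * sigma_theta ** 2))
    return radial * angular

def filter_image(img, **kwargs):
    """Apply the Log-Gabor filter in the Fourier domain; the magnitude of the
    complex response carries the local energy used by phase-congruency measures."""
    H = log_gabor_2d(*img.shape, **kwargs)
    F = np.fft.fftshift(np.fft.fft2(img))
    return np.fft.ifft2(np.fft.ifftshift(F * H))

response = filter_image(np.random.rand(256, 256), f0=0.1, theta0=np.pi / 4)
```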
Figures:
Figure 1. Workflow of the matching algorithm in this paper.
Figure 2. The anisotropic weighted moment map: (a) long-wave infrared image; (b) medium-wave infrared image.
Figure 3. Feature point detection results on the anisotropic weighted moment map: (a) long-wave infrared image; (b) medium-wave infrared image.
Figure 4. Feature point detection results on the original image: (a) long-wave infrared image; (b) medium-wave infrared image.
Figure 5. Descriptor generation flowchart.
Figure 6. Part of the remote sensing images.
Figure 7. Matching results of long-wave infrared images based on five methods: (a) SIFT; (b) SURF; (c) ORB; (d) HAPCG; (e) the proposed algorithm.
Figure 8. Matching results of medium-wave infrared images based on five methods: (a) SIFT; (b) SURF; (c) ORB; (d) HAPCG; (e) the proposed algorithm.
Figure 9. Matching results of long-wave images based on the proposed algorithm.
Figure 10. Matching results of medium-wave images based on the proposed algorithm.
Figure 11. Results of several indicators for long-wave images.
Figure 12. Results of several indicators for medium-wave images.
Figure 13. Matching results of the proposed algorithm under different rotation differences: (a) 30 degrees; (b) 60 degrees; (c) 90 degrees; (d) 120 degrees; (e) 150 degrees; (f) 180 degrees; (g) 210 degrees; (h) 240 degrees; (i) 270 degrees; (j) 300 degrees.
Figure 14. NCM results for the rotated images.
Figure 15. RMSE results for the rotated images.
22 pages, 6239 KiB  
Article
Fine-Grained Aircraft Recognition Based on Dynamic Feature Synthesis and Contrastive Learning
by Huiyao Wan, Pazlat Nurmamat, Jie Chen, Yice Cao, Shuai Wang, Yan Zhang and Zhixiang Huang
Remote Sens. 2025, 17(5), 768; https://doi.org/10.3390/rs17050768 - 23 Feb 2025
Viewed by 305
Abstract
With the rapid development of deep learning, significant progress has been made in remote sensing image target detection. However, methods based on deep learning are confronted with several challenges: (1) the inherent limitations of activation functions and downsampling operations in convolutional networks lead to frequency deviations and loss of local detail information, affecting fine-grained object recognition; (2) class imbalance and long-tail distributions further degrade the performance of minority categories; (3) large intra-class variations and small inter-class differences make it difficult for traditional deep learning methods to effectively extract fine-grained discriminative features. To address these issues, we propose a novel remote sensing aircraft recognition method. First, to mitigate the loss of local detail information, we introduce a learnable Gabor filter-based texture feature extractor, which enhances the discriminative feature representation of aircraft categories by capturing detailed texture information. Second, to tackle the long-tail distribution problem, we design a dynamic feature hallucination module that synthesizes diverse hallucinated samples, thereby improving the feature diversity of tail categories. Finally, to handle the challenge of large intra-class variations and small inter-class differences, we propose a contrastive learning module to enhance the spatial discriminative features of the targets. Extensive experiments on the large-scale fine-grained datasets FAIR1M and MAR20 demonstrate the effectiveness of our method, achieving detection accuracies of 53.56% and 89.72%, respectively, and surpassing state-of-the-art performance. The experimental results validate that our approach effectively addresses the key challenges in remote sensing aircraft recognition.
(This article belongs to the Special Issue Efficient Object Detection Based on Remote Sensing Images)
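The learnable Gabor texture extractor is only described at a high level here; the PyTorch-style sketch below shows one common way to make Gabor parameters trainable, by regenerating the kernels from nn.Parameter values on every forward pass so gradients flow into orientation, wavelength, and bandwidth. The layer sizes and parameterization are assumptions for illustration, not the paper's actual module.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableGabor2d(nn.Module):
    """Convolution whose kernels are Gabor functions with trainable
    orientation, wavelength, bandwidth, and phase (a sketch, not the paper's layer)."""
    def __init__(self, out_channels=16, kernel_size=11):
        super().__init__()
        self.kernel_size = kernel_size
        self.theta = nn.Parameter(torch.rand(out_channels) * math.pi)
        self.lambd = nn.Parameter(torch.full((out_channels,), kernel_size / 2))
        self.sigma = nn.Parameter(torch.full((out_channels,), kernel_size / 4))
        self.psi = nn.Parameter(torch.zeros(out_channels))

    def kernels(self):
        k = self.kernel_size // 2
        ys, xs = torch.meshgrid(
            torch.arange(-k, k + 1, dtype=torch.float32),
            torch.arange(-k, k + 1, dtype=torch.float32),
            indexing="ij",
        )
        xs, ys = xs[None], ys[None]                  # broadcast over output channels
        theta = self.theta[:, None, None]
        x_r = xs * torch.cos(theta) + ys * torch.sin(theta)
        y_r = -xs * torch.sin(theta) + ys * torch.cos(theta)
        sigma, lambd, psi = (p[:, None, None] for p in (self.sigma, self.lambd, self.psi))
        g = torch.exp(-(x_r ** 2 + y_r ** 2) / (2 * sigma ** 2)) \
            * torch.cos(2 * math.pi * x_r / lambd + psi)
        return g[:, None]                            # (out, 1, k, k): single input channel

    def forward(self, x):
        return F.conv2d(x, self.kernels(), padding=self.kernel_size // 2)

layer = LearnableGabor2d()
features = layer(torch.randn(1, 1, 64, 64))          # -> (1, 16, 64, 64)
```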
Figures:
Figure 1. (a) The number of aircraft in each class of the MAR20 dataset. (b) The number of aircraft in each class of the FAIR1M dataset. (c) The 10 types of aircraft in the FAIR1M dataset. The long-tail distribution and fine-grained categories make object detection a more challenging task.
Figure 2. Overall architecture of our proposed method.
Figure 3. The structure of the learnable Gabor filter. The learnable Gabor filter is implemented through deep learning frameworks, allowing the filter's parameters to be automatically adjusted during the training process to enhance the model's ability to extract fine-grained features.
Figure 4. Schematic diagram of the learnable histogram operator.
Figure 5. Schematic diagram of the feature aggregation module.
Figure 6. Detection visualization results of RTMDet, DIMA, and our method on the FAIR1M dataset.
Figure 7. Detection visualization results of RTMDet, ReDet, and our method on the MAR20 dataset.
Figure 8. Visualization of features from the four stages of FPN on the FAIR1M and MAR20 datasets.
Figure 9. Visualization of features extracted by the learnable Gabor filters.
Figure 10. Visualization of 128 learnable Gabor filter convolution kernels on the FAIR1M dataset.
Figure 11. Confusion matrices of Oriented R-CNN and our method on the FAIR1M and MAR20 test datasets.
21 pages, 3281 KiB  
Article
Multi-Space Feature Fusion and Entropy-Based Metrics for Underwater Image Quality Assessment
by Baozhen Du, Hongwei Ying, Jiahao Zhang and Qunxin Chen
Entropy 2025, 27(2), 173; https://doi.org/10.3390/e27020173 - 6 Feb 2025
Viewed by 499
Abstract
In marine remote sensing, underwater images play an indispensable role in ocean exploration, owing to their richness in information and intuitiveness. However, underwater images often encounter issues such as color shifts, loss of detail, and reduced clarity, leading to the decline of image quality. Therefore, it is critical to study precise and efficient methods for assessing underwater image quality. No-reference multi-space feature fusion and entropy-based metrics for underwater image quality assessment (MFEM-UIQA) are proposed in this paper. Considering the color shifts of underwater images, the chrominance difference map is created from the chrominance space and statistical features are extracted. Moreover, considering the information representation capability of entropy, entropy-based multi-channel mutual information features are extracted to further characterize chrominance features. For the luminance space features, contrast features from luminance images based on gamma correction and luminance uniformity features are extracted. In addition, logarithmic Gabor filtering is applied to the luminance space images for subband decomposition and entropy-based mutual information of subbands is captured. Furthermore, underwater image noise features, multi-channel dispersion information, and visibility features are extracted to jointly represent the perceptual features. The experiments demonstrate that the proposed MFEM-UIQA surpasses the state-of-the-art methods.
(This article belongs to the Collection Entropy in Image Analysis)
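A minimal NumPy sketch of histogram-based entropy and mutual information, the kind of quantities the abstract uses to characterize chrominance channels and subbands; the bin counts and the assumption that channel values are normalized to [0, 1] are illustrative choices, not the paper's settings.

```python
import numpy as np

def entropy(channel, bins=256):
    """Shannon entropy of one image channel, estimated from its histogram."""
    hist, _ = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(ch_a, ch_b, bins=64):
    """Mutual information between two channels via their joint histogram:
    I(A;B) = H(A) + H(B) - H(A,B)."""
    joint, _, _ = np.histogram2d(ch_a.ravel(), ch_b.ravel(),
                                 bins=bins, range=[[0, 1], [0, 1]])
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    h_x = -np.sum(px[px > 0] * np.log2(px[px > 0]))
    h_y = -np.sum(py[py > 0] * np.log2(py[py > 0]))
    h_xy = -np.sum(pxy[pxy > 0] * np.log2(pxy[pxy > 0]))
    return h_x + h_y - h_xy

rgb = np.random.rand(128, 128, 3)        # stand-in for a normalized underwater image
mi_rg = mutual_information(rgb[..., 0], rgb[..., 1])
```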
Figures:
Figure 1. The framework of MFEM-UIQA.
Figure 2. Underwater images and corresponding UCD maps. (a) Underwater images of different quality levels; (b) corresponding UCD maps.
Figure 3. Comparison of the statistical distribution of MSCN coefficients for original underwater images and the corresponding Ψ_D. (a) The statistical distribution of MSCN coefficients for the original underwater images; (b) the statistical distribution of MSCN coefficients for Ψ_D.
Figure 4. Underwater images of different quality levels and the corresponding fitting Rayleigh distribution shape parameter. (a) Underwater images of different quality levels; (b) fitting Rayleigh distribution shape parameters corresponding to three channel histograms of the OC space.
Figure 5. Non-uniform brightness image and its block map. (a) Non-uniform brightness underwater image; (b) block map of (a).
Figure 6. Underwater images with differing quality and corresponding K-L divergence distribution. (a) Underwater images of different quality levels; (b) the K-L divergence distribution of three channels in the OC space.
Figure 7. Different quality underwater images and corresponding visibility values.
30 pages, 24013 KiB  
Article
Non-Concentric Differential Model with Geographic Information-Driven Weights Allocation for Enhanced Infrared Small Target Detection
by Lingbing Peng, Zhi Lu, Tao Lei and Ping Jiang
Remote Sens. 2025, 17(1), 75; https://doi.org/10.3390/rs17010075 - 28 Dec 2024
Viewed by 427
Abstract
Infrared small target detection technology has received extensive attention due to its advantages in long-distance monitoring. However, there is much room for improvement in its performance due to complex backgrounds and the lack of distinct features in small targets. Many specific scenarios can lead to target loss, such as edge-adjacent targets, intersecting targets, low contrast caused by locally bright backgrounds, and false alarms induced by globally bright backgrounds. To address these issues, we have identified the positional correlation differences between the local background location and whether the target can be perceived by the human eye, thereby introducing geographic information weights to represent this correlation difference. We first constructed a non-concentric Gaussian difference structure to prevent the central target energy loss caused by traditional concentric filters. Based on this, we introduced Gabor filters, which have the capability of directional feature extraction and position correlation representation, into the non-concentric differential structure. By adjusting the relative position of the Gabor filter center and configuring frequency parameters based on geographic information, we optimized the filter weights to handle complex situations, such as targets being close to background clutter or other targets. Subsequently, an improved logarithmic function was applied to adjust the overall saliency of candidate targets, preventing the loss of low-contrast targets and the residual high-energy background clutter. Extensive experiments show that our method exhibits effective detection performance and robustness in four application scenes and three challenging image distribution scenes.
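The exact DoGaGb construction is not given in the abstract; the sketch below only illustrates the general idea of a non-concentric difference kernel, pairing a centred Gaussian with a Gabor surround whose centre is displaced (for instance towards nearby clutter). All sizes, offsets, and parameter values are placeholders, not the paper's weights.

```python
import numpy as np

def gaussian2d(size, sigma, cx=0.0, cy=0.0):
    """2D Gaussian on a (size x size) grid, optionally shifted off centre."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    g = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def gabor2d(size, sigma, lambd, theta, cx=0.0, cy=0.0):
    """Real Gabor kernel whose centre can be displaced, so the surround
    lobe need not be concentric with the target lobe."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = (x - cx) * np.cos(theta) + (y - cy) * np.sin(theta)
    yr = -(x - cx) * np.sin(theta) + (y - cy) * np.cos(theta)
    g = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2)) * np.cos(2 * np.pi * xr / lambd)
    return g / np.abs(g).sum()

# Non-concentric difference: a centred narrow Gaussian for the target minus a
# Gabor surround shifted by (dx, dy) towards the suspected clutter direction.
size, dx, dy = 21, 3.0, 0.0
kernel = gaussian2d(size, sigma=1.0) - gabor2d(size, sigma=2.0, lambd=8.0, theta=0.0, cx=dx, cy=dy)
```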
Figures:
Figure 1. Flowchart of the method proposed in this paper.
Figure 2. Schematic diagram of three differential structures in a plane. (a) DoG, σ1 = 1, σ2 = 2. (b) iDoGb, σ1 = 0.6, σ2 = 1.2, σ3 = 1, σ4 = 2, θ = 0. (c) DoGaGb, σ1 = 1, σ2 = 1.4, σ3 = 2, θ = 0.
Figure 3. The DoGaGb model under different orientations, with σ1 = 1, σ2 = 1.4, σ3 = 2. (a) θ = 0. (b) θ = π/4. (c) θ = 3π/4. (d) θ = π/2.
Figure 4. Energy distribution of concentric Gaussian differences.
Figure 5. 3D display of three concentric differential models. (a) DoG, σ1 = 1, σ2 = 2. (b) DoGb, σ1 = 0.6, σ2 = 1.2, σ3 = 1, σ4 = 2, θ = 0. (c) DoGaGb, σ1 = 1, σ2 = 1.4, σ3 = 2, θ = 0.
Figure 6. Schematic diagram of a small target against a homogeneous background and a small target located at the edge of bright clutter. (a) Homogeneous background; (b) the edge of bright clutter.
Figure 7. Energy distribution of concentric one-dimensional standard Gaussian differences under different background parameters μ2. (a) μ2 = 1; (b) μ2 = 2; (c) μ2 = 5. The shaded areas indicate the target energy (red) and background energy (green).
Figure 8. Diagrams of several natural logarithm functions.
Figure 9. Illustration of the effectiveness of feature normalization. (a) Original image; (b) experimental result without logarithmic function normalization; (c) experimental result with logarithmic function normalization.
Figure 10. Diagram of the target and its surrounding background.
Figure 11. Detection performance of different algorithms in sky scenes. Examples 1–5 correspond to different test cases. The arrow in the 3D plot shows the target position, the rectangle in the grayscale image marks the target location, and the red circle represents the ground truth target position in the detection results.
Figure 12. Detection performance of different algorithms in sea-sky scenes. Examples 1–5 correspond to different test cases; annotations as in Figure 11.
Figure 13. Detection performance of different algorithms in urban scenes. Examples 1–5 correspond to different test cases; annotations as in Figure 11.
Figure 14. Detection performance of different methods in suburban scenes. Examples 1–5 correspond to different test cases; annotations as in Figure 11.
Figure 15. Detection performance of different methods for targets at the edge of background clutter. Examples 1–4 correspond to different test cases. The rectangle in the grayscale image marks the target location, and the red circle represents the ground truth target position in the detection results.
Figure 16. Detection performance of different algorithms for targets at the edge of background clutter. Examples 5–8 correspond to different test cases; annotations as in Figure 15.
Figure 17. Detection performance of different algorithms for targets at the edge of background clutter. Examples 9–12 correspond to different test cases; annotations as in Figure 15.
Figure 18. The detection and segmentation results of the proposed method in single-target scenes. Numbers 1–6 represent different test cases, while (a–f) correspond to various steps in the detection process. The rectangle in the grayscale images (a,d) indicates the location of the target, and the red circle in the 3D plots (b,e) marks the ground truth target position in the detection results.
Figure 19. The detection and segmentation results of the proposed method in multi-target scenes. Numbers 7–12 represent different test cases; annotations as in Figure 18.
Figure 20. ROC curve of single-frame dataset 1.
Figure 21. ROC curves for 6 groups of test sequences. (a–f) correspond to Seq 1–Seq 6, respectively.
17 pages, 6702 KiB  
Article
A Variational Neural Network Based on Algorithm Unfolding for Image Blind Deblurring
by Shaoqing Gong, Yeran Wang, Guangyu Yang, Weibo Wei, Junli Zhao and Zhenkuan Pan
Appl. Sci. 2024, 14(24), 11742; https://doi.org/10.3390/app142411742 - 16 Dec 2024
Viewed by 688
Abstract
Image blind deblurring is an ill-posed inverse problem in image processing. While deep learning approaches have demonstrated effectiveness, they often lack interpretability and require extensive data. To address these limitations, we propose a novel variational neural network based on algorithm unfolding. The model is solved using the half quadratic splitting (HQS) method and proximal gradient descent. For blur kernel estimation, we introduce an L0 regularizer to constrain the gradient information and use the fast Fourier transform (FFT) to solve the iterative results, thereby improving accuracy. Image restoration is initiated with Gabor filters for the convolution kernel, and the activation function is approximated using a Gaussian radial basis function (RBF). Additionally, two attention mechanisms improve feature selection. The experimental results on various datasets demonstrate that our model outperforms state-of-the-art algorithm unfolding networks and other blind deblurring models. Our approach enhances interpretability and generalization while utilizing fewer data and parameters.
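As a sketch of initializing restoration convolutions with Gabor filters, as the abstract mentions, the snippet below fills a trainable PyTorch Conv2d with an OpenCV-generated Gabor bank; the bank size, kernel size, and wavelengths are illustrative assumptions, and the unfolding iterations themselves are omitted.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn

def gabor_bank(n_orient=8, ksize=7, sigma=2.0, lambd=4.0, gamma=0.5):
    """Bank of real Gabor kernels at n_orient evenly spaced orientations."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    return np.stack([
        cv2.getGaborKernel((ksize, ksize), sigma, t, lambd, gamma, psi=0.0, ktype=cv2.CV_32F)
        for t in thetas
    ])

def gabor_initialized_conv(in_ch=1, n_orient=8, ksize=7):
    """Conv2d whose weights start as a Gabor bank; they remain trainable,
    so the network can refine them during training."""
    conv = nn.Conv2d(in_ch, n_orient, ksize, padding=ksize // 2, bias=False)
    bank = gabor_bank(n_orient, ksize)                            # (n_orient, k, k)
    weight = torch.from_numpy(bank).unsqueeze(1).repeat(1, in_ch, 1, 1) / in_ch
    with torch.no_grad():
        conv.weight.copy_(weight)
    return conv

conv = gabor_initialized_conv()
out = conv(torch.randn(1, 1, 64, 64))                              # -> (1, 8, 64, 64)
```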
Figures:
Figure 1. The flowchart of the algorithm unfolding.
Figure 2. The structure of the variational neural network. FFT-1 and FFT-2 correspond to Step 4 and Step 7 in Algorithm 1, respectively.
Figure 3. Examples of different models removing linear motion blur. (a) Ground truth; (b) DUBLID [22]; (c) DeblurGAN [8]; (d) DeblurGAN-v2 [24]; (e) DeepDeblur [6]; (f) SRN [7]; (g) SFNet [25]; (h) ours.
Figure 4. Examples of different models for nonlinear motion blur. (a) Ground truth; (b) DUBLID [22]; (c) DeblurGAN [8]; (d) DeblurGAN-v2 [24]; (e) DeepDeblur [6]; (f) SRN [7]; (g) SFNet [25]; (h) ours.
Figure 5. Examples of different models on the GoPro dataset. (a) Ground truth; (b) DUBLID [22]; (c) DeblurGAN [8]; (d) DeblurGAN-v2 [24]; (e) DeepDeblur [6]; (f) SRN [7]; (g) SFNet [25]; (h) ours.
Figure 6. Examples of different models on the Lai dataset. (a) Blurred; (b) DUBLID [22]; (c) DeblurGAN [8]; (d) DeblurGAN-v2 [24]; (e) DeepDeblur [6]; (f) SRN [7]; (g) SFNet [25]; (h) ours.
Figure 7. Experimental results with and without the L0 regularizer.
Figure 8. Blur kernels obtained by different regularizers.
28 pages, 7535 KiB  
Article
A New Computer-Aided Diagnosis System for Breast Cancer Detection from Thermograms Using Metaheuristic Algorithms and Explainable AI
by Hanane Dihmani, Abdelmajid Bousselham and Omar Bouattane
Algorithms 2024, 17(10), 462; https://doi.org/10.3390/a17100462 - 18 Oct 2024
Viewed by 1661
Abstract
Advances in the early detection of breast cancer and treatment improvements have significantly increased survival rates. Traditional screening methods, including mammography, MRI, ultrasound, and biopsies, while effective, often come with high costs and risks. Recently, thermal imaging has gained attention due to its minimal risks compared to mammography, although it is not widely adopted as a primary detection tool since it depends on identifying skin temperature changes and lesions. The advent of machine learning (ML) and deep learning (DL) has enhanced the effectiveness of breast cancer detection and diagnosis using this technology. In this study, a novel interpretable computer-aided diagnosis (CAD) system for breast cancer detection is proposed, leveraging Explainable Artificial Intelligence (XAI) throughout its various phases. To achieve these goals, we proposed a new multi-objective optimization approach named the Hybrid Particle Swarm Optimization algorithm (HPSO) and Hybrid Spider Monkey Optimization algorithm (HSMO). These algorithms simultaneously combined the continuous and binary representations of PSO and SMO to effectively manage trade-offs between accuracy, feature selection, and hyperparameter tuning. We evaluated several CAD models and investigated the impact of handcrafted methods such as Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), Gabor Filters, and Edge Detection. We further shed light on the effect of feature selection and optimization on feature attribution and model decision-making processes using the SHapley Additive exPlanations (SHAP) framework, with a particular emphasis on cancer classification using the DMR-IR dataset. The results of our experiments demonstrate in all trials that the performance of the model is improved. With HSMO, our models achieved an accuracy of 98.27% and F1-score of 98.15% while selecting only 25.78% of the HOG features. This approach not only boosts the performance of CAD models but also ensures comprehensive interpretability. This method emerges as a promising and transparent tool for early breast cancer diagnosis.
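A small sketch of handcrafted Gabor-bank features of the sort evaluated in the study, computing the mean and standard deviation of filter responses over a thermogram region of interest; the bank size, kernel size, and wavelengths are assumed values chosen for illustration.

```python
import cv2
import numpy as np

def gabor_features(gray, n_orient=4, n_scale=3):
    """Handcrafted texture descriptor: mean and standard deviation of the
    responses of a small Gabor filter bank (orientations x scales)."""
    feats = []
    for s in range(n_scale):
        lambd = 4.0 * (2 ** s)                       # wavelength doubles per scale
        for o in range(n_orient):
            theta = o * np.pi / n_orient
            kern = cv2.getGaborKernel((15, 15), lambd / 2, theta, lambd, 0.5,
                                      psi=0.0, ktype=cv2.CV_32F)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern)
            feats += [resp.mean(), resp.std()]
    return np.asarray(feats)                         # length = 2 * n_orient * n_scale

roi = np.random.rand(120, 120).astype(np.float32)    # stand-in for a cropped thermogram ROI
vector = gabor_features(roi)                         # 24-dimensional feature vector
```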
Figures:
Figure 1. The evolution and role of XAI in decision making by stakeholders.
Figure 2. Pipeline representation.
Figure 3. Process of gathering IDs and division into train–test sets in the DMR-IR database.
Figure 4. Sample feature-extracted images and their distributions: (a) original cropped images (top left: healthy sample; bottom left: sick sample); (b) LBP; (c) Canny edges; (d) HOG; (e) Gabor filter.
Figure 5. (a) Original cropped image; (b) feature-extracted image (top: Canny edges; middle: LBP extractor; bottom: HOG); (c) selected features mapped back to the extracted features.
Figure 6. Example of feature selection; top: extracted features using the LBP extractor; bottom: selected features using HSMO.
Figure 7. Example of explaining a prediction using a SHAP plot: the left side after applying HSMO optimization; the right side without optimization, both using an HOG feature extractor. The X-axis represents the value of Feature 4, and the Y-axis shows its SHAP value, indicating its contribution to the model's prediction. The color gradient represents secondary features (Feature 21 in the left plot and Feature 25 in the right plot), showing their interaction with Feature 4. The optimized model (left) displays a clearer and more structured relationship between features.
Figure 8. Optimization process flow. Solid arrows represent the flow of the process between the stages of the CAD system, while dashed boxes represent the hybrid optimization scheme and the XAI framework, which operate in parallel with the process to optimize various parameters at its different stages. The objective function evaluates the model's classification accuracy.
Figure 9. The overall HSMO optimization process. The widening search space at the beginning of the iterations illustrates the diverse solutions explored by the population. Both the binary feature-selection region and the continuous parameter-optimization region demonstrate convergence as the algorithm iterates. This convergence signifies the collaborative nature of the HSMO algorithm, where local leaders and the global leader guide the population towards more optimal solutions.
Figure 10. Resulting feature histograms and optimized feature mapping for the LBP, HOG, edge, and Gabor extractors.
Figure 11. Top: without optimization (left: full features; middle: BSMO; right: BPSO). Bottom: with optimization (left: full features; middle: HSMO; right: HPSO).
Figure 12. Feature heatmaps using the HSMO optimizer: (a) Canny edge detector; (b) Gabor filters; (c) LBP; (d) HOG. Full and optimized feature heatmaps are shown on the right and left, respectively.
10 pages, 3009 KiB  
Article
Unsupervised Learning for the Automatic Counting of Grains in Nanocrystals and Image Segmentation at the Atomic Resolution
by Woonbae Sohn, Taekyung Kim, Cheon Woo Moon, Dongbin Shin, Yeji Park, Haneul Jin and Hionsuck Baik
Nanomaterials 2024, 14(20), 1614; https://doi.org/10.3390/nano14201614 - 10 Oct 2024
Viewed by 1070
Abstract
Identifying the grain distribution and grain boundaries of nanoparticles is important for predicting their properties. Experimental methods for identifying the crystallographic distribution, such as precession electron diffraction, are limited by their probe size. In this study, we developed an unsupervised learning method by applying a Gabor filter to HAADF-STEM images at the atomic level for image segmentation and automatic counting of grains in polycrystalline nanoparticles. The methodology comprises a Gabor filter for feature extraction, non-negative matrix factorization for dimension reduction, and K-means clustering. We set the threshold distance and angle between the clusters required for the number of clusters to converge so as to automatically determine the optimal number of grains. This approach can shed new light on the nature of polycrystalline nanoparticles and their structure–property relationships.
(This article belongs to the Special Issue Exploring Nanomaterials through Electron Microscopy and Spectroscopy)
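The abstract spells out the pipeline (Gabor filter bank, per-pixel feature matrix, NMF dimension reduction, K-means clustering), so a compact sketch with OpenCV and scikit-learn is given below. Filter-bank and cluster parameters are illustrative, and the automatic selection of the optimal number of grains via threshold distance and angle is omitted.

```python
import numpy as np
import cv2
from sklearn.decomposition import NMF
from sklearn.cluster import KMeans

def segment_grains(gray, n_clusters=5, n_orient=6, n_scale=3, n_components=8):
    """Per-pixel Gabor responses -> NMF dimension reduction -> K-means labels.
    A sketch of the Gabor/NMF/K-means pipeline; parameters are illustrative."""
    h, w = gray.shape
    responses = []
    for s in range(n_scale):
        lambd = 3.0 * (2 ** s)
        for o in range(n_orient):
            kern = cv2.getGaborKernel((21, 21), lambd / 2, o * np.pi / n_orient,
                                      lambd, 0.5, 0.0, ktype=cv2.CV_32F)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern)
            responses.append(np.abs(resp).ravel())       # non-negative for NMF
    X = np.stack(responses, axis=1)                      # (h*w, n_scale*n_orient) feature matrix
    X_red = NMF(n_components=n_components, init="nndsvda", max_iter=400).fit_transform(X)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X_red)
    return labels.reshape(h, w)                          # class map rearranged into a 2D image

image = np.random.rand(96, 96).astype(np.float32)        # stand-in for an atomic-resolution HAADF-STEM image
segmentation = segment_grains(image)
```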
Figures:
Figure 1. Schematic of the Gabor-filter-based clustering for particle segmentation. (1) Application of multiple Gabor filters; (2) creation of a feature vector for each pixel to obtain a feature matrix; (3) dimension reduction using NMF followed by K-means clustering. The class vectors are rearranged into a 2D matrix, illustrating the segmented image.
Figure 2. Sequence of k values for the Au nanoparticle with a five-fold twin and colorized segmentation, compared with the ground truth. (a) HAADF-STEM image, showing the five-fold twins of the particle. Segmentation and colored classes for (b) k = 2; (c) k = 4; (d) k = 6; (e) k = 8. (f) Ground truth of the segmentation of (a). The different colors indicate the different classes after clustering.
Figure 3. Segmented images of the Au nanoparticles with various k values. (a) HAADF-STEM image, showing the five-fold twins of the particle. Segmentation and color maps for (b) k = 7; (c) k = 8; (d) k = 9; (e) k = 10. The different colors indicate the different classes after clustering.
Figure 4. Segmentation of PtNi intermetallic nanoparticles. (a) HAADF-STEM image of a PtNi intermetallic nanoparticle; (b) segmented image with k = 5. The different colors indicate the different classes after clustering.
Figure 5. Segmentation of the PtNi intermetallic nanoparticles. (a) HAADF-STEM image of the PtNi intermetallic nanoparticle; (b) segmented image with k = 6. The different colors indicate the different classes after clustering.
Figure 6. Automated segmentation by setting a threshold value with k = 10. (a–d) HAADF-STEM images of intermetallic nanoparticles for image segmentation. Starting from the same k value, these images are segmented with optimal k values of (e) 5, (f) 5, (g) 4, and (h) 9. The different colors indicate the different classes after clustering.
23 pages, 9520 KiB  
Article
Visual Feature-Guided Diamond Convolutional Network for Finger Vein Recognition
by Qiong Yao, Dan Song, Xiang Xu and Kun Zou
Sensors 2024, 24(18), 6097; https://doi.org/10.3390/s24186097 - 20 Sep 2024
Viewed by 783
Abstract
Finger vein (FV) biometrics have garnered considerable attention due to their inherent non-contact nature and high security, exhibiting tremendous potential in identity authentication and beyond. Nevertheless, challenges pertaining to the scarcity of training data and inconsistent image quality continue to impede the effectiveness of finger vein recognition (FVR) systems. To tackle these challenges, we introduce the visual feature-guided diamond convolutional network (dubbed ‘VF-DCN’), a uniquely configured multi-scale and multi-orientation convolutional neural network. The VF-DCN showcases three pivotal innovations: Firstly, it meticulously tunes the convolutional kernels through multi-scale Log-Gabor filters. Secondly, it implements a distinctive diamond-shaped convolutional kernel architecture inspired by human visual perception. This design intelligently allocates more orientational filters to medium scales, which inherently carry richer information. In contrast, at extreme scales, the use of orientational filters is minimized to simulate the natural blurring of objects at extreme focal lengths. Thirdly, the network boasts a deliberate three-layer configuration and fully unsupervised training process, prioritizing simplicity and optimal performance. Extensive experiments are conducted on four FV databases, including MMCBNU_6000, FV_USM, HKPU, and ZSC_FV. The experimental results reveal that VF-DCN achieves remarkable improvement with equal error rates (EERs) of 0.17%, 0.19%, 2.11%, and 0.65%, respectively, and Accuracy Rates (ACC) of 100%, 99.97%, 98.92%, and 99.36%, respectively. These results indicate that, compared with some existing FVR approaches, the proposed VF-DCN not only achieves notable recognition accuracy but also has fewer parameters and lower model complexity. Moreover, VF-DCN exhibits superior robustness across diverse FV databases.
(This article belongs to the Section Sensing and Imaging)
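Since the results are reported as equal error rates, the sketch below shows one straightforward way to estimate an EER from genuine and impostor similarity scores by sweeping a decision threshold; the score distributions here are synthetic placeholders, not data from the paper.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Equal error rate from matching scores (higher score = more similar):
    sweep a threshold and return the error rate where FAR and FRR cross."""
    genuine = np.asarray(genuine_scores)
    impostor = np.asarray(impostor_scores)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, eer = 1.0, None
    for t in thresholds:
        frr = np.mean(genuine < t)           # genuine pairs wrongly rejected
        far = np.mean(impostor >= t)         # impostor pairs wrongly accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

rng = np.random.default_rng(0)
genuine = rng.normal(0.8, 0.08, 500)          # toy similarity scores for same-finger pairs
impostor = rng.normal(0.5, 0.10, 5000)        # toy scores for different-finger pairs
print(f"EER ~ {equal_error_rate(genuine, impostor):.4%}")
```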
Figures:
Figure 1. Radial filters under different values of σr.
Figure 2. Angular filters under different angular scaling factors T.
Figure 3. Bank of Log-Gabor filters. Each row in (c) contains filters computed at the same scale; for each scale, 10 orientations are sampled.
Figure 4. Illustration of the framework of VF-DCN.
Figure 5. Diamond convolutional structure of VF-DCN.
Figure 6. Adaptive orientational filter learning strategy for the convolutional kernels across different scales.
Figure 7. ROI images of the four FV databases. The ROIs in (a,b) are provided by the datasets themselves, while the ROIs in (c,d) are extracted by the 3σ criterion [1].
Figure 8. Trend of EER at varying parameters.
Figure 9. ROC curves of various diamond-shaped convolutional structures on the four finger vein databases.
25 pages, 13590 KiB  
Article
Fast and Nondestructive Proximate Analysis of Coal from Hyperspectral Images with Machine Learning and Combined Spectra-Texture Features
by Jihua Mao, Hengqian Zhao, Yu Xie, Mengmeng Wang, Pan Wang, Yaning Shi and Yusen Zhao
Appl. Sci. 2024, 14(17), 7920; https://doi.org/10.3390/app14177920 - 5 Sep 2024
Cited by 1 | Viewed by 1348
Abstract
Proximate analysis, including ash, volatile matter, moisture, fixed carbon, and calorific value, is a fundamental aspect of fuel testing and serves as the primary method for evaluating coal quality, which is critical for the processing and utilization of coal. The traditional analytical methods involve time-consuming and costly combustion processes, particularly when applied to large volumes of coal that need to be sampled in massive batches. Hyperspectral imaging is promising for the rapid and nondestructive determination of coal quality indices. In this study, a fast and nondestructive coal proximate analysis method with combined spectral-spatial features was developed using a hyperspectral imaging system in the 450–2500 nm range. The processed spectra were evaluated using PLSR, with the most effective MSC spectra selected. To reduce the spectral redundancy and improve the accuracy, the SPA, Boruta, iVISSA, and CARS algorithms were adopted to extract the characteristic wavelengths, and 16 prediction models were constructed and optimized based on the PLSR, RF, BPNN, and LSSVR algorithms within the Optuna framework for each quality indicator. For spatial information, the histogram statistics, gray-level covariance matrix, and Gabor filters were employed to extract the texture features within the characteristic wavelengths. The texture feature-based and combined spectral-texture feature-based prediction models were constructed by applying the spectral modeling strategy, respectively. Compared with the models based on spectral or texture features only, the LSSVR models with combined spectral-texture features achieved the highest prediction accuracy in all quality metrics, with Rp2 values of 0.993, 0.989, 0.979, 0.948, and 0.994 for Ash, VM, MC, FC, and CV, respectively. This study provides a technical reference for hyperspectral imaging technology as a new method for the rapid, nondestructive proximate analysis and quality assessment of coal.
(This article belongs to the Section Optics and Lasers)
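A sketch of gray-level co-occurrence matrix (GLCM) texture features, one of the spatial descriptors the study extracts at the characteristic wavelengths, using scikit-image (>= 0.19 for the graycomatrix/graycoprops names); the quantization level, offsets, and the random stand-in band are assumptions.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(band, levels=32):
    """Gray-level co-occurrence features (contrast, correlation, energy,
    homogeneity) for one characteristic-wavelength band image."""
    # Quantize the reflectance band into `levels` gray levels.
    q = np.digitize(band, np.linspace(band.min(), band.max(), levels)) - 1
    glcm = graycomatrix(q.astype(np.uint8),
                        distances=[1], angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])   # 4 props x 4 angles

band = np.random.rand(80, 80)         # stand-in for a hyperspectral band at a selected wavelength
texture_vector = glcm_features(band)  # 16-dimensional texture descriptor
```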
Figures:
Figure 1. Research flow chart of the study.
Figure 2. Pseudo-color images (817 nm, 661 nm, and 549 nm) of four coal samples and the measured values of the quality indices. The samples are arranged from left to right by CV, from highest to lowest.
Figure 3. Reflectance spectra and characteristic wavelengths obtained by averaging pixels in the region of interest from hyperspectral images of 61 coal samples. Each colored curve represents one coal sample.
Figure 4. Evolution of the noise level with wavelength, evaluated from the spectra of all coal samples. Wavelengths with prominent noise spikes (>0.5%) have been excluded from further analysis.
Figure 5. Spectral curves of (a) raw and preprocessed reflectance of all coal samples using the (b) SG, (c) FD, and (d) MSC methods. The red curves indicate the mean spectra of all coal samples, and the gray shadows represent the range of spectral reflectance.
Figure 6. Results of characteristic wavelength extraction (marked by red squares). The characteristic wavelengths of Ash, VM, MC, and CV were extracted by CARS; the characteristic wavelengths of FC were extracted by Boruta.
Figure 7. Scatter plots of actual and predicted coal quality index values obtained using the optimal LSSVR model based on the combined spectra-texture features. (a) Ash; (b) VM; (c) MC; (d) FC; (e) CV.
Figure 8. Contributions of the top ten significant variables by SHAP value in the optimal prediction models for the coal quality indices.
Figure 9. Relative contribution of the coal quality indices based on mean absolute SHAP values.
Figure 10. Predictive distribution of the coal quality indices by combined spectra-texture features based on hyperspectral images. The four samples for each index correspond to the minimum, 25%, 75%, and maximum values in the dataset, from left to right.
18 pages, 16408 KiB  
Article
Enhanced Scratch Detection for Textured Materials Based on Optimized Photometric Stereo Vision and Fast Fourier Transform–Gabor Filtering
by Yaoshun Yue, Wenpeng Sang, Kaiwei Zhai and Maohai Lin
Appl. Sci. 2024, 14(17), 7812; https://doi.org/10.3390/app14177812 - 3 Sep 2024
Viewed by 1209
Abstract
In the process of scratch defect detection in textured materials, there are often problems of low efficiency in traditional manual detection, large errors in machine vision, and difficulty in distinguishing defective scratches from the background texture. In order to solve these problems, we developed an enhanced scratch defect detection system for textured materials based on optimized photometric stereo vision and FFT-Gabor filtering. We designed and optimized a novel hemispherical image acquisition device that allows for selective lighting angles. This device integrates images captured under multiple light sources to obtain richer surface gradient information for textured materials, overcoming issues caused by high reflections or dark shadows under a single light source angle. At the same time, for the textured material, scratches and a textured background are difficult to distinguish; therefore, we introduced a Gabor filter-based convolution kernel, leveraging the fast Fourier transform (FFT), to perform convolution operations and spatial domain phase subtraction. This process effectively enhances the defect information while suppressing the textured background. The effectiveness and superiority of the proposed method were validated through material applicability experiments and comparative method evaluations using a variety of textured material samples. The results demonstrated a stable scratch capture success rate of 100% and a recognition detection success rate of 98.43% ± 1.0%.
(This article belongs to the Section Applied Industrial Technologies)
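The paper's FFT-Gabor enhancement with spatial-domain phase subtraction is not fully specified in the abstract; the sketch below only shows the general pattern of FFT-based convolution with a zero-mean Gabor kernel followed by subtraction from the original image to suppress periodic texture. All parameters and the subtraction scheme are illustrative assumptions.

```python
import numpy as np
import cv2
from scipy.signal import fftconvolve

def enhance_scratches(gray, theta=0.0, lambd=8.0, sigma=4.0):
    """FFT-based convolution with a zero-mean Gabor kernel, then subtraction of
    the response magnitude from the original image, damping the periodic
    background texture while keeping scratch-like deviations."""
    kern = cv2.getGaborKernel((31, 31), sigma, theta, lambd, 0.5, 0.0, ktype=cv2.CV_32F)
    kern -= kern.mean()                                    # zero-mean: flat regions respond ~0
    response = fftconvolve(gray.astype(np.float32), kern, mode="same")
    enhanced = gray.astype(np.float32) - np.abs(response)
    return cv2.normalize(enhanced, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

texture = np.random.rand(256, 256).astype(np.float32)      # stand-in for a textured surface image
result = enhance_scratches(texture)
```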
Figures:
Figure 1. (a) Due to the micro-geometry, some micro-planes are occluded and do not receive light (shadowing). (b) Light reflected from micro-planes that cannot be seen from the observation direction is also not visible (masking).
Figure 2. Enhanced scratch detection for textured materials based on optimized photometric stereo vision and FFT-Gabor filtering.
Figure 3. Light-source-selective image acquisition device based on photometric stereo vision: (A) the principle of photometric stereo vision; (B) the principle of its application.
Figure 4. The impact of different numbers of light sources on the evaluated image quality.
Figure 5. Photometric stereo vision based on input images from 8 light sources.
Figure 6. Photometric stereo vision image acquisition for different textured materials. (A) Coarse textured leather; (B) fine textured leather; (C) textile fabric; (D) textured kraft paper.
Figure 7. Framework for scratch defect detection on textured material surfaces based on the image enhancement algorithm.
Figure 8. The effect of image contrast enhancement after the Gabor-filter-based fast Fourier transform. Sample A is fine textured leather, sample B is coarse textured leather, and sample C is light textured leather.
Figure 9. Validation of the applicability of the detection method to textured materials. Sample A is coarse-textured kraft paper; sample B is fine-textured leather; sample C is coarse linen fabric; sample D is fine-textured kraft paper; sample E is a textured wood panel.
Figure 10. Validation of the method's superiority. (A) A fine-textured leather material with a deep and dense self-texture; (B) a coarse-textured leather material with a deep and irregular self-texture; (C) a fine-textured leather material with a shallow and relatively regular self-texture.
33 pages, 30114 KiB  
Article
Exploring the Influence of Object, Subject, and Context on Aesthetic Evaluation through Computational Aesthetics and Neuroaesthetics
by Fangfu Lin, Wanni Xu, Yan Li and Wu Song
Appl. Sci. 2024, 14(16), 7384; https://doi.org/10.3390/app14167384 - 21 Aug 2024
Cited by 1 | Viewed by 1423
Abstract
Background: In recent years, computational aesthetics and neuroaesthetics have provided novel insights into understanding beauty. Building upon the findings of traditional aesthetics, this study aims to combine these two research methods to explore an interdisciplinary approach to studying aesthetics. Method: Abstract artworks were used as experimental materials. Based on traditional aesthetics and in combination, features of composition, tone, and texture were selected. Computational aesthetic methods were then employed to correspond these features to physical quantities: blank space, gray histogram, Gray Level Co-occurrence Matrix (GLCM), Local Binary Pattern (LBP), and Gabor filters. An electroencephalogram (EEG) experiment was carried out, in which participants conducted aesthetic evaluations of the experimental materials in different contexts (genuine, fake), and their EEG data were recorded to analyze the impact of various feature classes in the aesthetic evaluation process. Finally, a Support Vector Machine (SVM) was utilized to model the feature data, Event-Related Potentials (ERPs), context data, and subjective aesthetic evaluation data. Result: Behavioral data revealed higher aesthetic ratings in the genuine context. ERP data indicated that genuine contexts elicited more negative deflections in the prefrontal lobes between 200 and 1000 ms. Class II compositions demonstrated more positive deflections in the parietal lobes at 50–120 ms, while Class I tones evoked more positive amplitudes in the occipital lobes at 200–300 ms. Gabor features showed significant variations in the parieto-occipital area at an early stage. Class II LBP elicited a prefrontal negative wave with a larger amplitude. The results of the SVM models indicated that the model incorporating aesthetic subject and context data (ACC = 0.76866) outperforms the model using only parameters of the aesthetic object (ACC = 0.68657). Conclusion: A positive context tends to provide participants with a more positive aesthetic experience, but abstract artworks may not respond to this positivity. During aesthetic evaluation, the ERP data activated by different features show a trend from global to local. The SVM model based on multimodal data fusion effectively predicts aesthetics, further demonstrating the feasibility of the combined research approach of computational aesthetics and neuroaesthetics.
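The SVM modelling step maps naturally onto a scikit-learn grid search over C and γ, the two quantities varied in Figure 14; the feature matrix below is a random placeholder standing in for the fused object features (blank space, gray histogram, GLCM, LBP, Gabor), ERP amplitudes, and context flags, so the labels and scores have no bearing on the paper's results.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Toy fused feature matrix: columns could hold composition, tone, and texture
# statistics plus ERP amplitudes and a context flag (placeholder values).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
y = rng.integers(0, 2, size=200)             # binary aesthetic evaluation (placeholder labels)

grid = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    param_grid={"svc__C": [0.1, 1, 10, 100], "svc__gamma": [1e-3, 1e-2, 1e-1, 1]},
    scoring="accuracy", cv=5,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)    # best (C, gamma) pair and its cross-validated ACC
```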
Figures:
Figure 1. The calculation of blank space in Suprematist Composition: Airplane Flying (image processed by the authors as fair use from wikiart.org), https://www.wikiart.org/en/kazimir-malevich/aeroplane-flying-1915 (accessed on 4 March 2024).
Figure 2. Kernels of different wavelengths λ and angles θ.
Figure 3. Example images of features.
Figure 4. Illustration of the stimulus paradigm applied.
Figure 5. Grand-average event-related brain potentials and isopotential contour plot (200–1000 ms) for genuine and fake contexts. N = 12.
Figure 6. Grand-average event-related brain potentials and isopotential contour plot (50–120 ms) for context (genuine, fake) × composition (Class I, Class II). N = 12.
Figure 7. Grand-average event-related brain potentials and isopotential contour plot (200–300 ms) for context (genuine, fake) × tone (Class I, Class II). N = 12.
Figure 8. Grand-average event-related brain potentials and isopotential contour plot (70–130 ms) for context (genuine, fake) × Gabor mean (Class I, Class II). N = 12.
Figure 9. Grand-average event-related brain potentials and isopotential contour plot (70–130 ms) for context (genuine, fake) × Gabor variance (Class I, Class II). N = 12.
Figure 10. Grand-average event-related brain potentials and isopotential contour plot (70–130 ms and 200–300 ms) for context (genuine, fake) × Gabor energy (Class I, Class II). N = 12.
Figure 11. Grand-average event-related brain potentials and isopotential contour plot (500–1000 ms) for context (genuine, fake) × horizontal GLCM (Class I, Class II). N = 12.
Figure 12. Grand-average event-related brain potentials and isopotential contour plot (70–140 ms and 500–1000 ms) for context (genuine, fake) × diagonal GLCM (Class I, Class II). N = 12.
Figure 13. Grand-average event-related brain potentials and isopotential contour plot (300–1000 ms) for context (genuine, fake) × LBP (Class I, Class II). N = 12.
Figure 14. Performance of SVM models with varying C and γ values: (a) the ACC with different C and γ combinations; (b) the AUC with different C and γ combinations. (The closer the color is to red, the higher the value; the closer to blue, the lower the value.)
22 pages, 12904 KiB  
Article
Intelligent Classification and Segmentation of Sandstone Thin Section Image Using a Semi-Supervised Framework and GL-SLIC
by Yubo Han and Ye Liu
Minerals 2024, 14(8), 799; https://doi.org/10.3390/min14080799 - 5 Aug 2024
Cited by 1 | Viewed by 1210
Abstract
This study presents the development and validation of a robust semi-supervised learning framework specifically designed for the automated segmentation and classification of sandstone thin section images from the Yanchang Formation in the Ordos Basin. Traditional geological image analysis methods encounter significant challenges due to the labor-intensive and error-prone nature of manual labeling, compounded by the diversity and complexity of rock thin sections. Our approach addresses these challenges by integrating the GL-SLIC algorithm, which combines Gabor filters and Local Binary Patterns for effective superpixel segmentation, laying the groundwork for advanced component identification. The primary innovation of this research is the semi-supervised learning model that utilizes a limited set of manually labeled samples to generate high-confidence pseudo labels, thereby significantly expanding the training dataset. This methodology effectively tackles the critical challenge of insufficient labeled data in geological image analysis, enhancing the model’s generalization capability from minimal initial input. Our framework improves segmentation accuracy by closely aligning superpixels with the intricate boundaries of mineral grains and pores. Additionally, it achieves substantial improvements in classification accuracy across various rock types, reaching up to 96.3% in testing scenarios. This semi-supervised approach represents a significant advancement in computational geology, providing a scalable and efficient solution for detailed petrographic analysis. It not only enhances the accuracy and efficiency of geological interpretations but also supports broader hydrocarbon exploration efforts. Full article
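A minimal sketch of the general idea (not the authors' GL-SLIC implementation): standard SLIC superpixels with Gabor and LBP texture statistics pooled per segment, using scikit-image. The input filename, segment count, filter bank, and LBP settings are illustrative assumptions, and an RGB thin-section image is assumed.

```python
import numpy as np
from skimage import io, color
from skimage.segmentation import slic
from skimage.filters import gabor
from skimage.feature import local_binary_pattern

image = io.imread("thin_section.png")          # hypothetical RGB thin-section image
gray = color.rgb2gray(image)

# Plain SLIC superpixels; GL-SLIC additionally folds Gabor/LBP texture into the
# clustering distance, which is approximated here by per-segment pooling.
labels = slic(image, n_segments=400, compactness=10, start_label=0)

# Small texture bank: four Gabor response maps and a uniform LBP map.
gabor_maps = [gabor(gray, frequency=f, theta=t)[0]
              for f in (0.1, 0.3) for t in (0, np.pi / 2)]
lbp_map = local_binary_pattern((gray * 255).astype(np.uint8),
                               P=8, R=1, method="uniform")

# Per-superpixel descriptor: mean of each texture map inside the segment.
n_segments = labels.max() + 1
descriptors = np.zeros((n_segments, len(gabor_maps) + 1))
for k in range(n_segments):
    mask = labels == k
    descriptors[k, :-1] = [m[mask].mean() for m in gabor_maps]
    descriptors[k, -1] = lbp_map[mask].mean()

# descriptors can then be merged or passed on to the semi-supervised classifier.
```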
Show Figures

Figure 1: Thin section image of sandstone under plane-polarized light: the main components are quartz, kaolinite, matrix, pores and lithic fragments.
Figure 2: Workflow for recognizing minerals using GL-SLIC segmentation and semi-supervised training.
Figure 3: GL-BP feature extraction workflow integrating LBP operator and Gabor filters for sandstone thin section images.
Figure 4: Feature extraction visualization using Gabor filters at various scales and orientations for sandstone thin section images.
Figure 5: Mean feature comparison chart: (a) mean feature chart at scale 1; (b) mean feature chart at scale 2; (c) mean feature chart at scale 3; (d) mean feature chart at scale 4; (e) mean feature chart at scale 5; (f) mean feature chart at scale 6.
Figure 6: LBP Feature Extraction: (a) mean feature chart at scale 1; (b) mean feature chart at scale 2; (c) mean feature chart at scale 3; (d) mean feature chart at scale 4; (e) mean feature chart at scale 5; (f) mean feature chart at scale 6.
Figure 7: Semi-supervised self-training process.
Figure 8: Modified VGG16 Classifier Architecture.
Figure 9: Discriminator model architecture.
Figure 10: Comparison of superpixel segmentation algorithms on sandstone images: (a) original sandstone image; (b) FH; (c) QS; (d) SEEDS; (e) Watershed; (f) LSC; (g) SLIC; (h) GL-SLIC.
Figure 11: Comparison of segmentation results between SLIC and GL-SLIC algorithms: (a) pre-segmentation result by the SLIC algorithm; (b) pre-segmentation result using the GL-SLIC algorithm.
Figure 12: Detailed comparison between SLIC and GL-SLIC algorithms: (a1) detail area a1 from SLIC; (a2) detail area a2 from SLIC; (a3) detail area a3 from SLIC; (b1) detail area b1 from GL-SLIC; (b2) detail area b2 from GL-SLIC; (b3) detail area b3 from GL-SLIC.
Figure 13: Comparison of superpixel merging in medium-coarse-grained quartz sandstone: (a) medium-coarse-grained quartz sandstone image; (b) pre-segmentation result; (c) result after superpixel merging.
Figure 14: Iterative model training and data augmentation process using labeled and unlabeled rock data to mitigate overfitting and enhance classification accuracy: (a) primary model; (b) discriminator model.
Figure 15: Curves of training and testing accuracy variation with epochs for the primary model.
Figure 16: Classification accuracy analysis: (a) training set confusion matrix, (b) test set confusion matrix.
Figure 17: Improved model accuracy post dataset cleansing and enhancement.
Figure 18: Final confusion matrices for model evaluation: (a) training data confusion matrix, (b) testing data confusion matrix.
Figure 19: Component identification results: (a) original petrographic thin section images; (b) proposed method results; (c) UNet-based semantic segmentation results.
26 pages, 3348 KiB  
Article
Hybrid Feature Mammogram Analysis: Detecting and Localizing Microcalcifications Combining Gabor, Prewitt, GLCM Features, and Top Hat Filtering Enhanced with CNN Architecture
by Miguel Alejandro Hernández-Vázquez, Yazmín Mariela Hernández-Rodríguez, Fausto David Cortes-Rojas, Rafael Bayareh-Mancilla and Oscar Eduardo Cigarroa-Mayorga
Diagnostics 2024, 14(15), 1691; https://doi.org/10.3390/diagnostics14151691 - 5 Aug 2024
Cited by 3 | Viewed by 1792
Abstract
Breast cancer is a prevalent malignancy characterized by the uncontrolled growth of glandular epithelial cells, which can metastasize through the blood and lymphatic systems. Microcalcifications, small calcium deposits within breast tissue, are critical markers for early detection of breast cancer, especially in non-palpable carcinomas. These microcalcifications, appearing as small white spots on mammograms, are challenging to identify due to potential confusion with other tissues. This study hypothesizes that a hybrid feature extraction approach combined with Convolutional Neural Networks (CNNs) can significantly enhance the detection and localization of microcalcifications in mammograms. The proposed algorithm employs Gabor, Prewitt, and Gray Level Co-occurrence Matrix (GLCM) kernels for feature extraction. These features are input to a CNN architecture designed with maxpooling layers, Rectified Linear Unit (ReLU) activation functions, and a sigmoid response for binary classification. Additionally, the Top Hat filter is used for precise localization of microcalcifications. The preprocessing stage includes enhancing contrast using the Volume of Interest Look-Up Table (VOI LUT) technique and segmenting regions of interest. The CNN architecture comprises three convolutional layers, three ReLU layers, and three maxpooling layers. The training was conducted using a balanced dataset of digital mammograms, with the Adam optimizer and binary cross-entropy loss function. Our method achieved an accuracy of 89.56%, a sensitivity of 82.14%, and a specificity of 91.47%, outperforming related works, which typically report accuracies around 85–87% and sensitivities between 76 and 81%. These results underscore the potential of combining traditional feature extraction techniques with deep learning models to improve the detection and localization of microcalcifications. This system may serve as an auxiliary tool for radiologists, enhancing early detection capabilities and potentially reducing diagnostic errors in mass screening programs. Full article
(This article belongs to the Special Issue Quantitative and Intelligent Analysis of Medical Imaging, 2nd Edition)
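To make the two classical stages named in the abstract concrete, the OpenCV sketch below computes a Prewitt gradient magnitude, one orientation of a Gabor bank, and a white top-hat with an elliptical (disk-like) structuring element to emphasise small bright microcalcification candidates. The filename, kernel sizes, Gabor parameters, and thresholding rule are assumptions for illustration, not the authors' settings.

```python
import numpy as np
import cv2

mammo = cv2.imread("mammogram.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Prewitt edge responses (OpenCV has no built-in Prewitt, so use filter2D).
kx = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=np.float32)
mf = mammo.astype(np.float32)
prewitt_mag = cv2.magnitude(cv2.filter2D(mf, -1, kx),
                            cv2.filter2D(mf, -1, kx.T))

# One orientation of a Gabor bank; a full bank would sweep theta.
gk = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=0.0,
                        lambd=10.0, gamma=0.5)
gabor_resp = cv2.filter2D(mf, -1, gk)

# White top-hat: original minus morphological opening with a disk-like element,
# which leaves small bright structures such as microcalcifications.
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
tophat = cv2.morphologyEx(mammo, cv2.MORPH_TOPHAT, se)
candidates = tophat > tophat.mean() + 3 * tophat.std()   # crude candidate mask
```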
Show Figures

Figure 1: Overview of the CNN architecture summarizing the methods section. It illustrates the process from input mammogram, through convolutional, ReLU, and maxpooling layers, to the final classification of mammograms as positive or negative for MCs. The green circle marks the location of the MCs.
Figure 2: CNN architecture for detecting microcalcifications in mammograms, including convolutional, ReLU, maxpooling, flattening, dense, and output layers.
Figure 3: Pipeline for localizing MCs in mammographic images. The process includes preprocessing, applying morphological operations (erosion and dilation) with a disk-shaped structuring element, and using the Top Hat transformation to highlight microcalcifications in the final processed image.
Figure 4: (a) Mammograms without VOI-LUT transformation, showing unnormalized contrast and visible annotations from medical examinations and equipment; (b) mammograms with VOI-LUT transformation, demonstrating normalized contrast and the removal of annotations, resulting in improved visibility of microcalcifications.
Figure 5: Feature extraction stage using Prewitt, GLCM, and Gabor kernels. The columns show the original mammogram, the application of each kernel, and the resulting feature maps after convolution, maxpooling, and ReLU stages in the CNN. The red box denotes the section that was processed to illustrate this example.
Figure 6: Training and validation loss and accuracy over 50 epochs, demonstrating the convergence and stability of the CNN model during the learning process with a dataset of 5121 mammograms.
Figure 7: Analysis of hyperparameter tuning impact.
Figure 8: ROC curves for the non-tuned and tuned models demonstrate the improvement in performance after hyperparameter tuning. The true positive rate (sensitivity) is plotted against the false positive rate (1 − specificity) for various threshold values.
Figure 9: Violin plots of predicted probabilities for true positives, false positives, true negatives, and false negatives. The plots display the distribution of predicted probabilities, combining box plots and density plots to provide a comprehensive view of the data.
Figure 10: Counts of model prediction outcomes verified by radiologist.
17 pages, 14796 KiB  
Article
Application of Gabor, Log-Gabor, and Adaptive Gabor Filters in Determining the Cut-Off Wavelength Shift of TFBG Sensors
by Sławomir Cięszczyk
Appl. Sci. 2024, 14(15), 6394; https://doi.org/10.3390/app14156394 - 23 Jul 2024
Cited by 1 | Viewed by 997
Abstract
Tilted fibre Bragg gratings are optical fibre structures used as sensors of various physical quantities. Their unique measurement capabilities result from the high complexity of the optical spectrum consisting of several dozen cladding mode resonances. TFBG spectra demodulation methods generate signal features that highlight changes in the spectrum due to changes in the interacting quantities. Such methods should enable the distinction between two slightly different values of the measured quantity. The paper presents an effective method of processing the TFBG spectrum for use in measuring the refractive index of liquids. The use of Gabor and log-Gabor filters and their adaptive version eliminates the problem of discontinuity in determining the SRI value related to the existence of the cladding mode comb. The Gabor filters used make visible the shifting and fading of spectral features related to the decrease in the intensity of leaking modes. Subsequent modifications of the proposed algorithm led to an increase in the quality factor of the processed spectrum. Full article
(This article belongs to the Section Optics and Lasers)
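The core frequency-domain operation described above can be sketched in a few lines: build a 1-D log-Gabor transfer function centred on the cladding-mode comb frequency, filter the transmission spectrum, and take the analytic-signal envelope whose steepest change tracks the cut-off wavelength. The centre frequency and bandwidth ratio below are placeholder assumptions, not values from the paper.

```python
import numpy as np
from scipy.signal import hilbert

def log_gabor_envelope(spectrum, f0=0.05, sigma_ratio=0.55):
    """Filter a 1-D TFBG transmission spectrum (uniform wavelength grid) with a
    log-Gabor transfer function and return its envelope plus the index of the
    steepest envelope change. f0 is the normalised centre frequency of the
    cladding-mode comb (an assumed placeholder)."""
    n = len(spectrum)
    freqs = np.fft.rfftfreq(n)                       # normalised frequency axis
    lg = np.zeros_like(freqs)
    nz = freqs > 0                                   # log-Gabor is zero at DC
    lg[nz] = np.exp(-(np.log(freqs[nz] / f0) ** 2)
                    / (2 * np.log(sigma_ratio) ** 2))

    filtered = np.fft.irfft(np.fft.rfft(spectrum) * lg, n)
    envelope = np.abs(hilbert(filtered))             # analytic-signal envelope
    cutoff_index = int(np.argmax(np.abs(np.diff(envelope))))
    return envelope, cutoff_index

# Applying this to spectra recorded at different SRI values and tracking
# cutoff_index (or the envelope-derivative maximum) yields a calibration curve.
```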
Show Figures

Figure 1: Schematic view of the experimental setup.
Figure 2: Transmission spectra of a TFBG grating immersed in solutions with different SRI values.
Figure 3: Fourier transform of the optical spectrum of the TFBG grating for the 15 SRI values considered.
Figure 4: Impulse response of the real and imaginary parts of the Gabor filters adjusted to the main frequency associated with the occurrence of cladding modes.
Figure 5: Unprocessed TFBG spectrum and its real part after filtration with a Gabor filter.
Figure 6: Real and imaginary part of the spectrum after Gabor filtering, along with the envelope.
Figure 7: Envelopes of the TFBG spectra after Gabor filtering for different SRI values.
Figure 8: Derivatives (first differences) of the envelopes from Figure 6.
Figure 9: Shift of the maximum value of the derivative as a function of SRI.
Figure 10: Impulse response of the real and imaginary parts of the Gabor filter adjusted to the frequency of the second harmonic of the TFBG transmission spectrum.
Figure 11: Frequency spectra calculated on the basis of the TFBG transmission spectra, along with the frequency response of the Gabor filter adjusted to the second harmonic of the cladding mode comb.
Figure 12: The real and imaginary parts of the spectrum of cladding modes after filtration with a Gabor filter, along with the envelope.
Figure 13: Derivatives of the cladding mode envelopes for the second harmonic of the optical spectra.
Figure 14: Shift of the cut-off wavelength as a function of the SRI value for the spectra of the second harmonic components of cladding modes.
Figure 15: TFBG frequency spectra and frequency response of the log-Gabor filter.
Figure 16: Derivatives of the envelopes calculated using the log-Gabor filter matched to the first harmonic of the optical spectra.
Figure 17: Adaptive matching of centre frequency of the log-Gabor filter.
Figure 18: Envelope derivatives of optical spectra processed by the adaptive log-Gabor filter.
Figure 19: Frequency response of the adaptive log-Gabor filter matched to frequency spectra of the optical spectra.
Figure 20: Envelope derivatives for an adaptive log-Gabor filter fitted to the second harmonic of the optical spectra.
Figure 21: Envelopes determined with a log-Gabor filter based on the first derivative of the TFBG transmission spectra.
Figure 22: Envelope derivatives for an adaptive log-Gabor filter fitted to the first harmonic based on the first derivative of the TFBG spectra.
Figure 23: Dependence of the cut-off wavelength shift on the SRI coefficient value for optical spectra processed by an adaptive log-Gabor filter.
Figure 24: Envelope derivatives for the first derivative of TFBG spectra processed by an adaptive log-Gabor filter fitted to the second harmonic.
Figure 25: Dependence of the cut-off wavelength shift on the SRI value.
22 pages, 3024 KiB  
Article
Augmenting Aquaculture Efficiency through Involutional Neural Networks and Self-Attention for Oplegnathus Punctatus Feeding Intensity Classification from Log Mel Spectrograms
by Usama Iqbal, Daoliang Li, Zhuangzhuang Du, Muhammad Akhter, Zohaib Mushtaq, Muhammad Farrukh Qureshi and Hafiz Abbad Ur Rehman
Animals 2024, 14(11), 1690; https://doi.org/10.3390/ani14111690 - 5 Jun 2024
Cited by 4 | Viewed by 1206
Abstract
Understanding the feeding dynamics of aquatic animals is crucial for aquaculture optimization and ecosystem management. This paper proposes a novel framework for analyzing fish feeding behavior based on a fusion of spectrogram-extracted features and a deep learning architecture. Raw audio waveforms are first transformed into Log Mel Spectrograms, and a fusion of features such as the Discrete Wavelet Transform, the Gabor filter, the Local Binary Pattern, and the Laplacian High Pass Filter, followed by a well-adapted deep model, is proposed to capture crucial spectral and temporal information that can help distinguish between the various forms of fish feeding behavior. The Involutional Neural Network (INN)-based deep learning model is used for classification, achieving an accuracy of up to 97% across various temporal segments. The proposed methodology is shown to be effective in accurately classifying the feeding intensities of Oplegnathus punctatus, enabling insights pertinent to aquaculture enhancement and ecosystem management. Future work may include additional feature extraction modalities and multi-modal data integration to further our understanding and contribute towards the sustainable management of marine resources. Full article
(This article belongs to the Special Issue Animal Health and Welfare in Aquaculture)
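A minimal sketch of the Log Mel Spectrogram front end described in the abstract, using librosa; the filename, sample rate, FFT length, hop size, and mel-band count are illustrative assumptions rather than the paper's acquisition settings.

```python
import numpy as np
import librosa

# Hypothetical feeding-sound clip; the real recordings come from the tank setup.
y, sr = librosa.load("feeding_clip.wav", sr=22050)

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                     hop_length=512, n_mels=128)
log_mel = librosa.power_to_db(mel, ref=np.max)   # Log Mel Spectrogram in dB

# log_mel is a 2-D, image-like array; the texture descriptors named above
# (DWT, Gabor, LBP, Laplacian high-pass) and the INN classifier operate on it.
```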
Show Figures

Figure 1: The structure of the experimental system of recirculating aquaculture.
Figure 2: Audio waveforms captured from aquaculture: (a) Audio of None class, (b) Audio of Medium class, and (c) Audio of Strong class.
Figure 3: Conversion of audio waveforms to Log Mel Spectrograms (using magma colormap): (a) audio waveform, (b) corresponding short-term Fourier Transform, (c) Mel spectrogram, and (d) Log Mel Spectrogram.
Figure 4: Segmentation of audio waveforms and their corresponding Log Mel Spectrograms (using magma colormap): (a) Audio of None class, (b) Audio of Medium class, (c) Audio of Strong class, (d) Log Mel Spectrogram of None class, (e) Log Mel Spectrogram of Medium class, and (f) Log Mel Spectrogram of Strong class.
Figure 5: Discrete wavelet transform extracted from the corresponding Log Mel Spectrogram Images: (a) None, (b) Medium, and (c) Strong.
Figure 6: Gabor filter applied on Log Mel Spectrogram Images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 7: Local Binary Pattern extracted from Log Mel Spectrogram Images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 8: Laplacian High Pass filter applied on Log Mel Spectrogram Images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 9: Combined Features extracted from LMS for a sample of 'Strong' class as an input to the model.
Figure 10: Involutional Neural Network: (a) involution layer, (b) involution layer with self-attention.
Figure A1: Confusion matrices: (a) Involutional Neural Network, (b) VGG16, (c) VGG19, (d) ResNet50, (e) Xception, (f) EfficientNet-b0, (g) InceptionNetV3, and (h) MobileNetV2.