Search Results (90)

Search Parameters:
Keywords = ship radiated noise

28 pages, 7710 KiB  
Article
Research on Underwater Acoustic Target Recognition Based on a 3D Fusion Feature Joint Neural Network
by Weiting Xu, Xingcheng Han, Yingliang Zhao, Liming Wang, Caiqin Jia, Siqi Feng, Junxuan Han and Li Zhang
J. Mar. Sci. Eng. 2024, 12(11), 2063; https://doi.org/10.3390/jmse12112063 - 14 Nov 2024
Viewed by 723
Abstract
In the context of a complex marine environment, extracting and recognizing underwater acoustic target features using ship-radiated noise present significant challenges. This paper proposes a novel deep neural network model for underwater target recognition, which integrates 3D Mel frequency cepstral coefficients (3D-MFCC) and 3D Mel features derived from ship audio signals as inputs. The model employs a serial architecture that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network. It replaces the traditional CNN with a multi-scale depthwise separable convolutional network (MSDC) and incorporates a multi-scale channel attention mechanism (MSCA). The experimental results demonstrate that the average recognition rate of this method reaches 87.52% on the DeepShip dataset and 97.32% on the ShipsEar dataset, indicating a strong classification performance.
(This article belongs to the Section Ocean Engineering)
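
As a rough illustration of the 3D feature construction described in the abstract, the sketch below stacks static MFCCs with their first- and second-order deltas into a three-channel input, assuming librosa; the sample rate, coefficient count, and function name are illustrative, not the paper's exact settings.

```python
import librosa
import numpy as np

def three_channel_mfcc(path, sr=16000, n_mfcc=40):
    """Stack MFCC, delta-MFCC, and delta-delta-MFCC as a 3-channel array."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    d1 = librosa.feature.delta(mfcc, order=1)  # first-order differences
    d2 = librosa.feature.delta(mfcc, order=2)  # second-order differences
    return np.stack([mfcc, d1, d2], axis=0)    # shape: (3, n_mfcc, n_frames)
```

The 3D-Mel counterpart would follow the same pattern with librosa.feature.melspectrogram in place of the MFCCs, and the two stacks would then be fused before entering the MSDC/MSCA network.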
Figures

Figure 1: The overall frame diagram after feature extraction; the original audio is used as the input of the subsequent joint neural network, and the recognition result is obtained through training. In the 3D-MFCC and 3D-Mel feature extraction module, the three channels are labeled from outside to inside as follows: the first channel is labeled MFCC and Mel, the second delta-MFCC and delta-Mel, and the third delta-delta-MFCC and delta-delta-Mel.
Figure 2: Feature extraction module. The process of extracting 3D-Mel and 3D-MFCC features from the original audio and fusing them into new 3D fusion features.
Figure 3: DW convolution process. DW convolution adjusts the size of the input feature map, but the number of channels does not change.
Figure 4: PW convolution process. The kernels used in PW convolution are all 1 × 1 in size, and each filter contains as many kernels as there are channels in the previous layer.
Figure 5: MSDC module. The input feature map is split into three parts along the channel dimension, convolved with 3 × 3, 5 × 5, and 7 × 7 kernels, respectively, and then merged into one feature map.
Figure 6: MSCA module. The feature map undergoes GAP compression, followed by parallel 1D convolutions of sizes 3, 5, and 7. The resulting vectors are added, passed through a sigmoid activation to produce channel attention, and then multiplied by the input feature map.
Figure 7: Structure of an LSTM block. The long short-term memory network consists of a central node and three gated units, usually called the input gate, output gate, and forget gate.
Figure 8: The loss curve, accuracy curve, and confusion matrix for the DeepShip dataset: (a) loss curve, (b) accuracy curve, (c) confusion matrix. The confusion matrix uses the labels (0, 1, 2, 3) for the classification results, and the shade of the color represents the probability.
Figure 9: The loss curve, accuracy curve, and confusion matrix for the ShipsEar dataset: (a) loss curve, (b) accuracy curve, (c) confusion matrix. The confusion matrix uses the labels (0, 1, 2, 3, 4) for the classification results, and the shade of the color represents the probability.
Figure 10: The recognition accuracy of 3D-Mel, 3D-MFCC, and 3D fusion extraction based on the CNN for the DeepShip dataset: (a) 3D-Mel; (b) 3D-MFCC; (c) 3D fusion extraction.
Figure 11: Comparison of 3D-Mel, 3D-MFCC, and 3D fusion feature extraction based on the CNN for the DeepShip dataset: accuracy, precision, recall, and F1-score.
Figure 12: Confusion matrices of the three feature extraction methods (CNN-based 3D-Mel, 3D-MFCC, and the proposed 3D fusion extraction) for the DeepShip dataset: (a) 3D-Mel, (b) 3D-MFCC, (c) 3D fusion extraction. Labels (0, 1, 2, 3) denote the classification results, and the shade of the color represents the probability.
Figure 13: The recognition accuracy of 3D-Mel, 3D-MFCC, and 3D fusion extraction based on the CNN for the ShipsEar dataset: (a) 3D-Mel; (b) 3D-MFCC; (c) 3D fusion extraction.
Figure 14: Comparison of 3D-Mel, 3D-MFCC, and 3D fusion feature extraction based on the CNN for the ShipsEar dataset: accuracy, precision, recall, and F1-score.
Figure 15: Confusion matrices of the three feature extraction methods (CNN-based 3D-Mel, 3D-MFCC, and the proposed 3D fusion extraction) for the ShipsEar dataset: (a) 3D-Mel, (b) 3D-MFCC, (c) 3D fusion extraction. Labels (0, 1, 2, 3, 4) denote the classification results, and the shade of the color represents the probability.
Figure 16: The recognition accuracy of multi-scale depthwise separable convolutional networks with different attention mechanisms (SA, SE, RSA, RSE, and MSCA) for the DeepShip dataset: (a) SA; (b) SE; (c) RSA; (d) RSE; (e) MSCA.
Figure 17: Comparison of the SA, SE, RSA, RSE, and MSCA attention mechanisms for the DeepShip dataset: accuracy, precision, recall, and F1-score.
Figure 18: Confusion matrices of the attention mechanisms within multi-scale depthwise separable convolutional networks for the DeepShip dataset: (a) SE; (b) SA; (c) RSA; (d) RSE; (e) MSCA. Labels (0, 1, 2, 3) denote the classification results, and the shade of the color represents the probability.
Figure 19: The recognition accuracy of multi-scale depthwise separable convolutional networks with different attention mechanisms (SA, SE, RSA, RSE, and MSCA) for the ShipsEar dataset: (a) SA; (b) SE; (c) RSA; (d) RSE; (e) MSCA.
Figure 20: Comparison of the SA, SE, RSA, RSE, and MSCA attention mechanisms for the ShipsEar dataset: accuracy, precision, recall, and F1-score.
Figure 21: Confusion matrices of the attention mechanisms within multi-scale depthwise separable convolutional networks for the ShipsEar dataset: (a) SE; (b) SA; (c) RSA; (d) RSE; (e) MSCA. Labels (0, 1, 2, 3, 4) denote the classification results, and the shade of the color represents the probability.
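Going only by the captions of Figures 3-5 above, a minimal PyTorch sketch of the multi-scale depthwise separable convolution idea: the channels are split into three groups, filtered by depthwise 3 × 3, 5 × 5, and 7 × 7 convolutions in parallel, and merged by a 1 × 1 pointwise convolution. The class name and layer sizes are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class MSDC(nn.Module):
    def __init__(self, channels):
        super().__init__()
        c = channels // 3
        self.splits = (c, c, channels - 2 * c)
        self.dw = nn.ModuleList([
            # depthwise: groups == channels gives one filter per channel
            nn.Conv2d(n, n, k, padding=k // 2, groups=n)
            for n, k in zip(self.splits, (3, 5, 7))
        ])
        self.pw = nn.Conv2d(channels, channels, 1)  # pointwise 1x1 mixes channels

    def forward(self, x):
        parts = torch.split(x, self.splits, dim=1)
        x = torch.cat([conv(p) for conv, p in zip(self.dw, parts)], dim=1)
        return self.pw(x)
```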
15 pages, 2167 KiB  
Article
Underwater Acoustic Target Recognition Based on Sub-Regional Feature Enhancement and Multi-Activated Channel Aggregation
by Zhongxiang Zheng and Peng Liu
J. Mar. Sci. Eng. 2024, 12(11), 1952; https://doi.org/10.3390/jmse12111952 - 31 Oct 2024
Viewed by 741
Abstract
Feature selection and fusion in ship radiated noise-based underwater target recognition have remained challenging tasks. This paper proposes a novel feature extraction method based on multi-dimensional feature selection and fusion. Redundant features are filtered through feature visualization techniques. The Sub-regional Feature Enhancement modules (SFE) and Multi-activated Channel Aggregation modules (MCA) within the neural network are utilized to achieve underwater target recognition. Experimental results indicate that our network, named Sub-Regional Channel Aggregation Net (SRCA-Net), utilizing 3-s sound segments for ship radiated noise recognition, surpasses existing models, achieving an accuracy of 78.52% on the public DeepShip dataset.
(This article belongs to the Section Ocean Engineering)
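
Since the result above is reported on 3 s sound segments, here is a trivial sketch of that slicing step; the sample rate is an assumption, not a value taken from the paper.

```python
import numpy as np

def segment_3s(signal, sr=32000, seconds=3.0):
    """Slice a 1D recording into non-overlapping fixed-length clips."""
    step = int(sr * seconds)
    n = len(signal) // step                   # drop the ragged tail
    return signal[: n * step].reshape(n, step)
```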
Figures

Figure 1: The structure of SRCA-Net.
Figure 2: Twelve feature extraction results for a particular cargo sample. "Index" represents a total of 12 Mel cepstra, "Pitch class" represents the pitch in the music, "Cqt_note" represents the standard pitch after the CQT transformation, and "Tonnetz" represents the tonal centroid where the frequency is projected into the 6-D space.
Figure 3: Results of feature visualisation using Grad-CAM for the two combined methods.
Figure 4: Sub-regional Feature Enhancement module schematic.
Figure 5: Multi-activated Channel Aggregation module schematic.
Figure 6: The confusion matrices of the experiments based on Cross Entropy Loss and Asymmetric Cross Entropy Loss.
21 pages, 7017 KiB  
Article
Multi-Scale Frequency-Adaptive-Network-Based Underwater Target Recognition
by Lixu Zhuang, Afeng Yang, Yanxin Ma and David Day-Uei Li
J. Mar. Sci. Eng. 2024, 12(10), 1766; https://doi.org/10.3390/jmse12101766 - 5 Oct 2024
Viewed by 643
Abstract
Due to the complexity of underwater environments, underwater target recognition based on radiated noise has always been challenging. This paper proposes a multi-scale frequency-adaptive network for underwater target recognition. Based on the different distribution densities of Mel filters in the low-frequency band, a three-channel improved Mel energy spectrum feature is designed first. Second, by combining a frequency-adaptive module, an attention mechanism, and a multi-scale fusion module, a multi-scale frequency-adaptive network is proposed to enhance the model’s learning ability. Then, the model training is optimized by introducing a time–frequency mask, a data augmentation strategy involving data confounding, and a focal loss function. Finally, systematic experiments were conducted based on the ShipsEar dataset. The results showed that the recognition accuracy for five categories reached 98.4%, and the accuracy for nine categories in fine-grained recognition was 88.6%. Compared with existing methods, the proposed multi-scale frequency-adaptive network for underwater target recognition has achieved significant performance improvement.
(This article belongs to the Section Ocean Engineering)
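
The training recipe above introduces a focal loss; a minimal sketch of the standard multi-class form follows (the focusing parameter gamma = 2 is the common default, not necessarily the paper's choice).

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # per-sample cross-entropy, down-weighted for well-classified examples
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)                 # model probability of the true class
    return ((1.0 - pt) ** gamma * ce).mean()
```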
Figures

Figure 1: The different design schemes for the Mel filter bank.
Figure 2: The extraction process of the three-channel improved Mel energy spectrum.
Figure 3: The Mel spectrogram after data augmentation.
Figure 4: The structure of the multi-scale frequency-adaptive network.
Figure 5: The structure of the frequency-adaptive algorithm.
Figure 6: The frequency-adaptive residual module.
Figure 7: The spatial compression and channel excitation module.
Figure 8: The channel squeeze and spatial excitation module.
Figure 9: The spatial and channel squeeze excitation module.
Figure 10: The multi-scale fusion module.
Figure 11: The impact of the frequency pooling kernel.
Figure 12: Confusion matrix for the five vessel types.
Figure 13: t-SNE results of different features: (a) Mel; (b) MFCC; (c) DF-Mel; (d) DF-MFCC; (e) 3C-IMFCC; (f) 3C-IMel.
Figure 14: Experiment times with different models.
Figure 15: t-SNE results on ShipsEar2.
Figure 16: Comparative analysis of the recognition results on ShipsEar2.
19 pages, 5119 KiB  
Article
Estimation of Source Range and Location Using Ship-Radiated Noise Measured by Two Vertical Line Arrays with a Feed-Forward Neural Network
by Moon Ju Jo, Jee Woong Choi and Dong-Gyun Han
J. Mar. Sci. Eng. 2024, 12(9), 1665; https://doi.org/10.3390/jmse12091665 - 18 Sep 2024
Viewed by 972
Abstract
Machine learning-based source range estimation is a promising method for enhancing the performance of tracking both the dynamic and static positions of targets in the underwater acoustic environment using extensive training data. This study constructed a machine learning model for source range estimation using ship-radiated noise recorded by two vertical line arrays (VLAs) during the Shallow-water Acoustic Variability Experiment (SAVEX-15), employing the Sample Covariance Matrix (SCM) and the Generalized Cross Correlation (GCC) as input features. A feed-forward neural network (FNN) was used to train the model on the acoustic characteristics of the source at various distances, and the range estimation results indicated that the SCM outperformed the GCC with lower error rates. Additionally, array tilt correction using the array invariant-based method improved range estimation accuracy. The impact of the training data composition corresponding to the bottom depth variation between the source and receivers on range estimation performance was also discussed. Furthermore, the estimated ranges from the two VLA locations were applied to localization using trilateration. Our results confirm that the SCM is the more appropriate feature for the FNN-based source range estimation model compared with the GCC and imply that ocean environment variability should be considered in developing a general-purpose machine learning model for underwater acoustics.
(This article belongs to the Special Issue Applications of Underwater Acoustics in Ocean Engineering)
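
For reference, the Sample Covariance Matrix used as an input feature here is the snapshot-averaged outer product of the array data; a bare-bones sketch follows (normalization conventions vary, and the paper's exact preprocessing is not shown).

```python
import numpy as np

def sample_covariance(snapshots):
    """snapshots: (n_snapshots, n_sensors) complex STFT values at one frequency."""
    # average of outer products x x^H over snapshots
    return np.einsum("ti,tj->ij", snapshots, np.conj(snapshots)) / len(snapshots)
```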
Figures

Figure 1: (a) Bathymetry of the experimental area and ship track of the R/V Onnuri. Range variations of the R/V Onnuri (b) from VLA1 and (c) from VLA2 as a function of time. The black dotted line represents the ship track and range variations used for the training and validation data, the black solid line corresponds to the test data, and the black dashed lines in the range variations represent other ships. The red box marks the ship track corresponding to the acoustic data analyzed in Figure 2.
Figure 2: (a) Spectrogram of acoustic data measured from VLA1 at a depth of 25 m from May 25 21:35 to 21:59 UTC. The thick black dashed line marks the time of the closest point of approach (21:48 UTC). (b) The spectral probability density of the acoustic data over 24 min. Thick magenta lines represent the average of the mean intensity level (dashed) and the median spectrum level corresponding to the 50th percentile (solid).
Figure 3: The estimated array tilt angle of the two VLAs, calculated every minute before (blue crosses) and after (orange circles) removing outliers with a median filter with a window size of 10: (a) VLA1 and (b) VLA2.
Figure 4: Source-receiver range estimation and localization results of the same-VLA test. Range estimation results are shown for (a) the VLA1 test using the SCM, (b) the VLA2 test using the SCM, (d) the VLA1 test using the GCC, and (e) the VLA2 test using the GCC. Estimated ranges before and after tilt correction are represented by blue crosses and orange circles, respectively. Source localization results after tilt correction are represented by the purple circles in (c) when using the SCM and in (f) when using the GCC. The black solid line represents the ship track and range variations used for the test data; the black dashed line represents another ship during the test time.
Figure 5: Source-receiver range estimation and localization results of the different-VLA tests, with the same panel layout and symbols as Figure 4.
Figure 6: Source-receiver range estimation and localization results of the large-training-data test, with the same panel layout and symbols as Figure 4.
Figure 7: Variations in the relative bottom depth and the percentage error of the range estimation as a function of the time of the test data (feature: SCM after tilt correction): (a) the same-VLA test, (b) the different-VLA test, and (c) the large-training-data test. Relative bottom depth and percentage error are represented by blue crosses and orange circles, respectively.
Figure 8: The training data distribution as a function of the average bottom depth and the source range obtained from VLA1 and VLA2: (a) before data resampling and (b) after data resampling.
Figure A1: MAPEs of the estimated ranges according to the array tilt change.
Figure A2: Estimated ranges when the relative bottom depths are (a) −3 m, (b) −2 m, (c) −1 m, (d) 1 m, (e) 2 m, and (f) 3 m.
Figure A3: Box plot of the percentage errors of the estimated ranges according to the relative bottom depth when the input feature is (a) the SCM or (b) the GCC. Within each box, horizontal lines denote median values; boxes extend from the 25th to the 75th percentile of the percentage errors; vertical lines denote adjacent values (the most extreme values within 1.5 interquartile ranges of the 25th and 75th percentiles); dots denote observations outside the range of adjacent values. The orange line denotes the mean value.
18 pages, 4081 KiB  
Article
A Dual-Stream Deep Learning-Based Acoustic Denoising Model to Enhance Underwater Information Perception
by Wei Gao, Yining Liu and Desheng Chen
Remote Sens. 2024, 16(17), 3325; https://doi.org/10.3390/rs16173325 - 8 Sep 2024
Viewed by 3683
Abstract
Estimating the line spectra of ship-radiated noise is a crucial remote sensing technique for detecting and recognizing underwater acoustic targets. Improving the signal-to-noise ratio (SNR) makes the low-frequency components of the target signal more prominent. This enhancement aids in the detection of underwater acoustic signals using sonar. Based on the characteristics of low-frequency narrow-band line spectra signals in underwater target radiated noise, we propose a dual-stream deep learning network with frequency characteristics transformation (DS_FCTNet) for line spectra estimation. The dual streams predict amplitude and phase masks separately and use an information exchange module to exchange learned features between the amplitude and phase spectra, aiding in better phase information reconstruction and signal denoising. Additionally, a frequency characteristics transformation module is employed to extract convolutional features between channels, obtaining global correlations of the amplitude spectrum and enhancing the ability to learn target signal features. Through experimental analysis on ShipsEar, a dataset of underwater acoustic signals recorded by hydrophones deployed in shallow water, the effectiveness and rationality of the different modules within DS_FCTNet are verified. Under low SNR conditions and with unknown ship types, the proposed DS_FCTNet model exhibits the best line spectrum enhancement compared to methods such as SEGAN and DPT_FSNet. Specifically, SDR and SSNR are improved by 14.77 dB and 13.58 dB, respectively, enabling the detection of weaker target signals and laying the foundation for target localization and recognition applications.
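
The SDR figures quoted above measure the ratio of reference energy to residual energy; a sketch of the plain, non-scale-invariant form (the paper's exact evaluation settings are not shown):

```python
import numpy as np

def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB between a clean reference and an estimate."""
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(noise**2))
```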
Figures

Figure 1: The T-F domain representation of the clean signal and its mixture after adding noise.
Figure 2: The PSD of the clean signal and its mixture after adding noise.
Figure 3: The architecture of the proposed DS_FCTNet model.
Figure 4: The diagram of the encoder. Above is the amplitude encoding layer; below is the phase encoding layer.
Figure 5: The diagram of the DSB, including the amplitude-stream block, the phase-stream block, and the communication between them.
Figure 6: The diagram of the FCT.
Figure 7: The diagram of the decoder layer. Above is the amplitude decoding layer; below is the phase decoding layer.
Figure 8: The arrangement of the three DSB modules.
Figure 9: Comparison of the clean signal with the denoised signals from the ablation experiments in the T-F domain.
Figure 10: Comparison of the clean signal with the denoised signals from the ablation experiments in terms of PSD.
Figure 11: Comparison of the clean signal with the signals denoised by the different methods in the T-F domain on Dataset-I.
Figure 12: Comparison of the clean signal with the signals denoised by the different methods in terms of PSD on Dataset-I.
Figure 13: Comparison of the clean signal with the denoised signals from the ablation experiments in the time domain.
Figure 14: Comparison of the clean signal with the signals denoised by the different methods in the T-F domain on Dataset-II.
Figure 15: Comparison of the clean signal with the signals denoised by the different methods in terms of PSD on Dataset-II.
Figure 16: Comparison of the clean signal with the signals denoised by the different methods for unknown ship types in terms of PSD on Dataset-III.
Figure 17: Comparison of the clean signal with the signals denoised by the different methods for unknown ship types in the T-F domain on Dataset-III.
17 pages, 2450 KiB  
Article
Modeling the Underwater Sound of Floating Offshore Windfarms in the Central Mediterranean Sea
by Marzia Baldachini, Robin D. J. Burns, Giuseppa Buscaino, Elena Papale, Roberto Racca, Michael A. Wood and Federica Pace
J. Mar. Sci. Eng. 2024, 12(9), 1495; https://doi.org/10.3390/jmse12091495 - 29 Aug 2024
Viewed by 1258
Abstract
In the shift toward sustainable energy production, offshore wind power has experienced notable expansion. Several projects to install floating offshore wind farms in European waters, ranging from a few to hundreds of turbines, are currently in the planning stage. The underwater operational sound generated by these floating turbines has the potential to affect marine ecosystems, although the extent of this impact remains underexplored. This study models the sound radiated by three planned floating wind farms in the Strait of Sicily (Italy), an area of significant interest for such developments. These wind farms vary in size (from 250 MW to 2800 MW) and environmental characteristics, including bathymetry and seabed substrates. Propagation losses were modeled in one-third-octave bands using JASCO Applied Sciences’ Marine Operations Noise Model, which is based on the parabolic equation method, combined with the BELLHOP beam-tracing model. Two sound speed profiles, corresponding to winter and summer, were applied to simulate seasonal variations in sound propagation. Additionally, sound from an offshore supply ship was incorporated with one of these wind farms to simulate maintenance operations. Results indicate that sound from operating wind farms could reach a broadband sound pressure level (Lp) of 100 dB re 1 µPa as far as 67 km from the wind farm. Nevertheless, this sound level is generally lower than the ambient sound in areas with intense shipping traffic. The findings are discussed in relation to local background sound levels and current guidelines and regulations. The implications for environmental management include the need for comprehensive monitoring and mitigation strategies to protect marine ecosystems from potential acoustic disturbances.
(This article belongs to the Section Ocean Engineering)
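
For orientation, propagation losses here are modeled in one-third-octave bands; the sketch below generates nominal band centre frequencies under the base-2 convention with a 1 kHz reference (the band range is illustrative, not the study's).

```python
import numpy as np

# Nominal one-third-octave band centres, base-2 convention, 1 kHz reference.
band_indices = np.arange(-18, 13)              # about 16 Hz up to 16 kHz
f_centre = 1000.0 * 2.0 ** (band_indices / 3.0)
print(f_centre[:4].round(1))                   # [15.6 19.7 24.8 31.2] Hz
```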
Figures

Figure 1: Top left: the Mediterranean basin, with the highlighted area enlarged in the top-right image, which shows all modeled wind farms (each yellow dot represents a turbine). Bottom left: the layout of Med Wind, with 190 turbines (A1 in green; A2 in yellow; A3 in red), and Hannibal, with 21 turbines. Bottom right: the layout of Sicily South, with 48 turbines on 24 foundations.
Figure 2: Modeled radiated sound fields (Lp) from Hannibal and Sicily South, with the February sound speed profile applied on the left and the August profile on the right. The color scale shows sound levels (Lp) from 100 dB up to the maximum per wind farm, as reported in the legend.
Figure 3: Zoomed modeled radiated sound field (Lp) from Hannibal, with the February sound speed profile.
Figure 4: Modeled radiated sound field (Lp) from Med Wind: February sound speed profile on the left, August sound speed profile on the right.
Figure 5: Modeled radiated sound field (Lp) from operational Med Wind plus a support ship on dynamic positioning.
16 pages, 5334 KiB  
Article
An Auditory Convolutional Neural Network for Underwater Acoustic Target Timbre Feature Extraction and Recognition
by Junshuai Ni, Fang Ji, Shaoqing Lu and Weijia Feng
Remote Sens. 2024, 16(16), 3074; https://doi.org/10.3390/rs16163074 - 21 Aug 2024
Cited by 1 | Viewed by 888
Abstract
In order to extract the line-spectrum features of underwater acoustic targets in complex environments, an auditory convolutional neural network (ACNN) with the ability of frequency component perception, timbre perception and critical information perception is proposed in this paper, inspired by the human auditory perception mechanism. This model first uses a gammatone filter bank that mimics the cochlear basilar membrane excitation response to decompose the input time-domain signal into a number of sub-bands, which guides the network to perceive the line-spectrum frequency information of the underwater acoustic target. A sequence of convolution layers is then used to filter out interfering noise and enhance the line-spectrum components of each sub-band by simulating the process of calculating the energy distribution features, after which the improved channel attention module is connected to select line spectra that are more critical for recognition; in this module, a new global pooling method is proposed and applied in order to better extract the intrinsic properties. Finally, the sub-band information is fused using a combination layer and a single-channel convolution layer to generate a vector with the same dimensions as the input signal at the output layer. A decision module with a Softmax classifier is added behind the auditory neural network and used to recognize the five classes of vessel targets in the ShipsEar dataset, achieving a recognition accuracy of 99.8%, an improvement of 2.7% over the previously proposed DRACNN method, with improvements of varying degrees over the other eight compared methods. The visualization results show that the model can significantly suppress the interfering noise intensity and selectively enhance the radiated noise line-spectrum energy of underwater acoustic targets.
(This article belongs to the Topic AI and Data-Driven Advancements in Industry 4.0)
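
The gammatone front end described above has a standard closed form; a sketch of one 4th-order channel's impulse response, using the Glasberg-Moore ERB bandwidth (parameter values are the textbook ones, not necessarily the paper's):

```python
import numpy as np

def gammatone_ir(fc, sr, duration=0.05, order=4):
    """Impulse response of one gammatone channel centred at fc (Hz)."""
    t = np.arange(int(duration * sr)) / sr
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # Glasberg-Moore ERB width
    b = 1.019 * erb                           # bandwidth parameter
    return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
```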
Figures

Figure 1: Time-frequency line-spectra diagrams. (a) Time-frequency diagram of the small boat. (b) Time-frequency diagram of the test vessel in a stationary state with only auxiliary machinery operating. (c) Time-frequency diagram of the fishing vessel with a shaft system failure. (d) Time-frequency diagram of the motor boat's radiated noise when its propeller is rotating at high speed.
Figure 2: ACNN model structure.
Figure 3: The frequency magnitude responses of the gammatone filter bank.
Figure 4: Channel attention mechanism.
Figure 5: Structure of the global pooling layer.
Figure 6: ACNN_DRACNN model structure.
Figure 7: Training curves of the ACNN_DRACNN model.
Figure 8: Confusion matrix of the recognition results with a batch size of 64.
Figure 9: Cost function value of the model on the validation dataset.
Figure 10: Recognition accuracy of the model on the validation dataset.
Figure 11: Power spectra of the input and output data: (a) sample of category A; (b) sample of category B; (c) sample of category C; (d) sample of category D; (e) sample of category E.
Figure 12: Data visualization by t-SNE: (a) raw signals of the ShipsEar dataset; (b) output data of the ACNN model.
19 pages, 11704 KiB  
Article
A Method for Underwater Acoustic Target Recognition Based on the Delay-Doppler Joint Feature
by Libin Du, Zhengkai Wang, Zhichao Lv, Dongyue Han, Lei Wang, Fei Yu and Qing Lan
Remote Sens. 2024, 16(11), 2005; https://doi.org/10.3390/rs16112005 - 2 Jun 2024
Cited by 2 | Viewed by 1424
Abstract
With the aim of solving the problem of identifying complex underwater acoustic targets using only a single Time–Frequency (TF) signal feature, this paper designs a method that recognizes underwater targets based on the Delay-Doppler joint feature. First, this method uses the symplectic finite Fourier transform (SFFT) to extract the Delay-Doppler features of underwater acoustic signals, analyzes the Time–Frequency features at the same time, and combines the Delay-Doppler (DD) feature and the Time–Frequency feature to form a joint feature (TF-DD). This paper uses three types of convolutional neural networks to verify that TF-DD can effectively improve the accuracy of target recognition. Second, this paper designs an object recognition model (TF-DD-CNN) with the joint features as input, which simplifies the neural network’s overall structure and improves the model’s training efficiency. This research employs ship-radiated noise to validate the efficacy of TF-DD-CNN for target identification. The results demonstrate that the joint feature and the TF-DD-CNN model introduced in this study can proficiently detect ships, and the model notably enhances the precision of detection.
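
A toy sketch of the SFFT step the method builds on, under one common convention (FFT along the time axis gives Doppler, inverse FFT along the frequency axis gives delay); sign and normalization conventions differ across references.

```python
import numpy as np

def sfft(tf_grid):
    """Map a time-frequency grid X[n, m] (n = time slot, m = frequency bin)
    into the delay-Doppler domain."""
    return np.fft.ifft(np.fft.fft(tf_grid, axis=0), axis=1)
```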
Figures

Graphical abstract
Figure 1: Signal Time–Frequency domain analysis process.
Figure 2: The conversion relationship between the signal’s Time–Frequency and Delay-Doppler domains.
Figure 3: Relationship between the Time–Frequency and Delay-Doppler domains.
Figure 4: The principle of the convolutional neural network.
Figure 5: Time domain feature of an ocean liner.
Figure 6: Frequency domain feature of an ocean liner.
Figure 7: Time–Frequency domain feature of an ocean liner.
Figure 8: Delay-Doppler feature of an ocean liner.
Figure 9: Three-dimensional distribution of the Delay-Doppler feature.
Figure 10: Principle of target recognition based on joint features.
Figure 11: The structure of the TF-DD-CNN model.
Figure 12: Structure of the Feature Fusion module.
Figure 13: Radiated noise data of the passenger ship in the first 20 s.
Figure 14: Photos of different ship types and ambient marine noise.
Figure 15: Radiated noise data of the passenger ship in the first 20 s.
Figure 16: Framing of the radiated noise signal.
Figure 17: Time domain data framing results.
Figure 18: Delay-Doppler (a) and Time–Frequency (b) features of a passenger ship.
Figure 19: Distribution of the loss function.
Figure 20: Distribution of the prediction accuracy of the three models.
Figure 21: Loss function of the TF-DD-CNN model.
Figure 22: Distribution of the prediction accuracy of the four models.
24 pages, 7124 KiB  
Article
Pressure Fluctuation and Flow-Induced Noise of the Fin and Rudder in a Water Tunnel
by Duo Qu, Yanfei Li, Ruibiao Li, Yunhui Chen and Yongou Zhang
Appl. Sci. 2024, 14(11), 4691; https://doi.org/10.3390/app14114691 - 29 May 2024
Viewed by 815
Abstract
The flow field and radiated noise resulting from water flowing through a fin and rudder were analyzed in this study. A hydrodynamic experiment was conducted in a water tunnel to measure the pressure fluctuations affecting a fin and rudder, and then the experimental data and Large Eddy Simulation (LES) results were compared and analyzed. The discussion presented herein focuses on the zero angle of attack and the Reynolds number based on a maximum width of the fin and rudder ranging from 3.6 × 10⁶ to 9.7 × 10⁶. Furthermore, a numerical model was developed using the LES turbulence model and Lighthill’s acoustic analog theory to predict the flow-induced noise generated by the fin and rudder. The test data reveal that the pressure fluctuation decreases as frequency increases, and the average rate of decrease is obtained for frequencies up to 5.0 kHz. Additionally, as flow velocity increases, the overall sound pressure level of flow-induced noise also increases. The relationship between the sound power radiated by the fin and rudder and the flow velocity approximately follows a power law with an exponent of seven, and the noise radiated on both sides is greater than that radiated in the direction of flow. The findings presented in this paper have practical implications for designing quieter rudders and optimizing the noise performance of underwater vehicles and ships, thereby addressing concerns regarding the impact of anthropogenic noise on marine life and ecosystems.
(This article belongs to the Section Fluid Science and Technology)
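
The power law with an exponent of seven implies a level difference of 70 log10(U2/U1) dB between two flow speeds; as a quick check against the 3 m/s and 8 m/s test conditions used in the study:

```python
import math

# Sound power ~ U^7, so the level difference is 10*log10((U2/U1)**7) dB.
delta_db = 70.0 * math.log10(8.0 / 3.0)
print(f"{delta_db:.1f} dB")  # about 29.8 dB going from 3 m/s to 8 m/s
```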
Figures

Figure 1: The side view of the fin and rudder.
Figure 2: The fin and rudder installed in the water tunnel: (a) side view of the fin; (b) side view of the rudder.
Figure 3: The schematic diagram of the water tunnel.
Figure 4: Diagram of the pressure fluctuation test system.
Figure 5: Positional diagram of the pressure sensors used in the experiment.
Figure 6: Flow domain and boundary conditions in the flow field simulation.
Figure 7: Acoustic domain in the sound field simulation.
Figure 8: Sectional view of the medium CFD mesh.
Figure 9: Sectional view of the local mesh and details of the gap between the fin and the rudder.
Figure 10: Comparison of the time-domain and frequency-domain curves of the pressure fluctuation at monitoring point #1: (a) time domain; (b) frequency domain.
Figure 11: Comparison of the pressure field for different flow velocities in the XZ plane: (a) 3 m/s; (b) 5 m/s; (c) 8 m/s.
Figure 12: Comparison of the pressure field for different flow velocities in the XY plane: (a) 3 m/s; (b) 5 m/s; (c) 8 m/s.
Figure 13: Velocity field and streamlines for different flow velocities in the XZ plane: (a) 3 m/s, Re ≈ 3.6 × 10⁶; (b) 5 m/s, Re ≈ 6.1 × 10⁶; (c) 8 m/s, Re ≈ 9.7 × 10⁶.
Figure 14: Velocity field and streamlines for different flow velocities in the XY plane: (a) 3 m/s, Re ≈ 3.6 × 10⁶; (b) 5 m/s, Re ≈ 6.1 × 10⁶; (c) 8 m/s, Re ≈ 9.7 × 10⁶.
Figure 15: Vorticity distribution for different flow velocities (Q = 12 s⁻¹): (a) 3 m/s; (b) 5 m/s; (c) 8 m/s.
Figure 16: Comparison of simulated and experimental pressure fluctuations in the 100–5000 Hz range at 3 m/s.
Figure 17: Comparison of simulated and experimental pressure fluctuations in the 100–5000 Hz range at 5 m/s.
Figure 18: Comparison of simulated and experimental pressure fluctuations in the 100–5000 Hz range at 8 m/s.
Figure 19: Sound pressure contours in the XZ plane at different frequencies: (a) 100 Hz; (b) 500 Hz; (c) 1000 Hz.
Figure 20: Sound pressure contours in the XY plane at different frequencies: (a) 100 Hz; (b) 500 Hz; (c) 1000 Hz.
Figure 21: Sound pressure contours in the YZ plane at different frequencies: (a) 100 Hz; (b) 500 Hz; (c) 1000 Hz.
Figure 22: Directivity of flow-induced noise at 5 m/s: (a) XY plane; (b) XZ plane; (c) YZ plane.
Figure 23: RNL of the flow-induced noise at a monitoring point: (a) narrow band (20–1000 Hz); (b) third-octave band.
17 pages, 9898 KiB  
Article
Ship-Radiated Noise Separation in Underwater Acoustic Environments Using a Deep Time-Domain Network
by Qunyi He, Haitao Wang, Xiangyang Zeng and Anqi Jin
J. Mar. Sci. Eng. 2024, 12(6), 885; https://doi.org/10.3390/jmse12060885 - 26 May 2024
Cited by 1 | Viewed by 1109
Abstract
Ship-radiated noise separation is critical in both military and economic domains. However, due to the complex underwater environments with multiple noise sources and reverberation, separating ship-radiated noise poses a significant challenge. Traditionally, underwater acoustic signal separation has employed blind source separation methods based on independent component analysis. Recently, the separation of underwater acoustic signals has been approached as a deep learning problem. This involves learning the features of ship-radiated noise from training data. This paper introduces a deep time-domain network for ship-radiated noise separation by leveraging the power of parallel dilated convolution and group convolution. The separation layer employs parallel dilated convolution operations with varying expansion factors to better extract low-frequency features from the signal envelope while preserving detailed information. Additionally, we use group convolution to reduce the expansion of network size caused by parallel convolution operations, enabling the network to maintain a smaller size and computational complexity while achieving good separation performance. The proposed approach is demonstrated to be superior to the other common networks in the DeepShip dataset through comprehensive comparisons.
(This article belongs to the Section Ocean Engineering)
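
A hedged PyTorch sketch of the parallel dilated group-convolution idea the abstract describes: several dilated 1D convolutions with different dilation factors run in parallel over grouped channels, and a 1 × 1 convolution merges the branches. Channel counts, dilations, and group size are assumptions.

```python
import torch
import torch.nn as nn

class ParallelDilatedGroupUnit(nn.Module):
    def __init__(self, channels=64, dilations=(1, 2, 4), groups=4):
        super().__init__()  # channels must be divisible by groups
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size=3,
                      padding=d, dilation=d, groups=groups)
            for d in dilations
        ])
        self.merge = nn.Conv1d(len(dilations) * channels, channels, kernel_size=1)

    def forward(self, x):  # x: (batch, channels, time)
        return self.merge(torch.cat([b(x) for b in self.branches], dim=1))
```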
Figures

Figure 1: The framework of the network.
Figure 2: The processes of encoding and decoding: (a) encoding; (b) decoding.
Figure 3: The design of the separation layer.
Figure 4: The design of a parallel dilated group convolutional unit.
Figure 5: The performance of the model in each epoch.
Figure 6: The time–frequency plots of the mixture, the ground truth, and the separated results: (a) mixture; (b–e) the ground truth of the tug signal and the separated results obtained from Res-UNet, UNet, and the proposed method, respectively; (f–i) the ground truth of the passenger ship signal and the separated results obtained from Res-UNet, UNet, and the proposed method, respectively.
Figure 7: The masks generated in the separation process: (a) the mask used for separating the tug signal; (b) the mask used for separating the passenger ship signal.
Figure 8: Scatter plots of SNR, SegSNR, SNRi, and SI-SNRi for the different methods on the DeepShip dataset: (a–d) Res-UNet; (e–h) UNet; (i–l) the proposed method.
Figure 9: Quantile–quantile plots of SNR, SegSNR, SNRi, and SI-SNRi for the different methods on the DeepShip dataset; the red dotted line represents a direct correspondence between the two distributions. (a–d) Res-UNet; (e–h) UNet; (i–l) the proposed method.
13 pages, 2602 KiB  
Article
Deep Learning Based Underwater Acoustic Target Recognition: Introduce a Recent Temporal 2D Modeling Method
by Jun Tang, Wenbo Gao, Enxue Ma, Xinmiao Sun and Jinying Ma
Sensors 2024, 24(5), 1633; https://doi.org/10.3390/s24051633 - 2 Mar 2024
Viewed by 1694
Abstract
In recent years, the application of deep learning models for underwater target recognition has become a popular trend. Most of these are pure 1D models used for processing time-domain signals or pure 2D models used for processing time-frequency spectra. In this paper, a recent temporal 2D modeling method is introduced into the construction of ship radiation noise classification models, combining 1D and 2D. This method is based on the periodic characteristics of time-domain signals, shaping them into 2D signals and discovering long-term correlations between sampling points through 2D convolution to compensate for the limitations of 1D convolution. Integrating this method with the current state-of-the-art model structure and using samples from the DeepShip database for network training and testing, it was found that this method could further improve the accuracy (0.9%) and reduce the parameter count (30%), providing a new option for model construction and optimization. Meanwhile, the effectiveness of training models using time-domain signals or time-frequency representations has been compared, finding that the model based on time-domain signals is more sensitive and has a smaller storage footprint (reduced to 30%), whereas the model based on time-frequency representation can achieve higher accuracy (1–2%).
(This article belongs to the Section Intelligent Sensors)
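
The temporal 2D modeling idea rests on reshaping a 1D signal by its period so that 2D convolutions can relate samples one period apart; a minimal sketch follows (in TimesNet-style models the period is estimated from dominant FFT components, here it is simply passed in).

```python
import numpy as np

def fold_by_period(x, period):
    """Reshape a 1D signal into a 2D array with one period per row."""
    pad = (-len(x)) % period           # zero-pad so the length divides evenly
    x = np.pad(x, (0, pad))
    return x.reshape(-1, period)       # rows: consecutive periods; columns:
                                       # the same phase across periods
```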
Figures

Figure 1: Various UATR pipelines: (a) based on machine learning (ML); (b) based on a DNN in pattern recognition mode; (c) also based on deep learning, but adopting an end-to-end pattern.
Figure 2: Partial structure of TimesNet: (a) algorithm diagram for discovering multiple periods; (b) structure of the Timesblock, which performs the 1D → 2D → 1D transition, where "reshape" (2D modeling) and "reshape back" denote the 1D → 2D and 2D → 1D conversions, respectively.
Figure 3: Waveform (above, frames with durations of 0.1 s) and T-F representation (below, segments with durations of 4 s) of the four ship categories. From left to right: Cargo, Tanker, Tug, and Passenger. The sampling points were standardized.
Figure 4: Example of the accuracy and loss variation curves. Each blue point represents the value for one epoch; the convergence of the model was satisfactory.
Figure 5: Block structures of each model used in the experiment. CAM represents the channel attention mechanism, and SAM represents the spatial attention mechanism.
Figure 6: Blocks with Timesblocks: (a) SE ResNet and (b) MSRDN.
Figure 7: Examples of thermal maps: (a–d) are Cargo, Tanker, Tug, and Passenger, respectively.
Figure 8: Confusion matrices. The darker the green, the higher the probability. The first to fourth rows are Cargo, Tanker, Tug, and Passenger, respectively; from left to right are T-F-MSRDN, T-F-SE ResNet, T-SE ResNet with Timesblocks, and T-MSRDN with Timesblocks.
Figure 9: t-SNE visualization. Classes 1.0 to 4.0 are Cargo, Tanker, Tug, and Passenger, respectively.
14 pages, 708 KiB  
Communication
Underwater Acoustic Nonlinear Blind Ship Noise Separation Using Recurrent Attention Neural Networks
by Ruiping Song, Xiao Feng, Junfeng Wang, Haixin Sun, Mingzhang Zhou and Hamada Esmaiel
Remote Sens. 2024, 16(4), 653; https://doi.org/10.3390/rs16040653 - 9 Feb 2024
Cited by 5 | Viewed by 1686
Abstract
Ship-radiated noise is the main basis for ship detection in underwater acoustic environments. Due to the increasing human activity in the ocean, the captured ship noise is usually mixed with or covered by other signals or noise. On the other hand, due to the softening effect of bubbles in the water generated by ships, ship noise undergoes non-negligible nonlinear distortion. To mitigate the nonlinear distortion and separate the target ship noise, blind source separation (BSS) becomes a promising solution. However, underwater acoustic nonlinear models are seldom used in research for nonlinear BSS. This paper is based on the hypothesis that the recovery and separation accuracy can be improved by considering this nonlinear effect in the underwater environment. The purpose of this research is to explore and discover a method with the above advantages. In this paper, a model is used in underwater BSS to describe the nonlinear impact of the softening effect of bubbles on ship noise. To separate the target ship-radiated noise from the nonlinear mixtures, an end-to-end network combining an attention mechanism and bidirectional long short-term memory (Bi-LSTM) recurrent neural network is proposed. Ship noise from the database ShipsEar and line spectrum signals are used in the simulation. The simulation results show that, compared with several recent neural networks used for linear and nonlinear BSS, the proposed scheme has an advantage in terms of the mean square error, correlation coefficient and signal-to-distortion ratio.
(This article belongs to the Special Issue Advanced Array Signal Processing for Target Imaging and Detection)
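
Figure 1 below sketches the PNL (post-nonlinear) mixing model; as a toy illustration, sources are mixed linearly and each observed channel then passes through a nonlinearity. Here tanh stands in for the bubble-softening distortion; the paper's actual nonlinearity is not reproduced.

```python
import numpy as np

def pnl_mix(sources, A, f=np.tanh):
    """sources: (n_sources, T); A: (n_obs, n_sources) linear mixing matrix."""
    return f(A @ sources)   # channel-wise nonlinearity after linear mixing
```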
Figures

Figure 1: The sketch of the PNL model used in this paper.
Figure 2: The model of the proposed recurrent attention neural network, containing two recurrent attention layers and an LSTM layer.
Figure 3: Spectra of the original and mixed ship noise, with the ship-noise characteristics severely hidden in the mixture spectra.
Figure 4: Spectra of the original and mixed line spectrum signals, with the line-spectra characteristics heavily distorted in the mixtures.
Figure 5: Fragment 1 of the original and separated waveforms, with the proposed network performing the best.
Figure 6: Fragment 2 of the original and separated waveforms, with the proposed network performing the best.
16 pages, 1471 KiB  
Article
Cross-Domain Contrastive Learning-Based Few-Shot Underwater Acoustic Target Recognition
by Xiaodong Cui, Zhuofan He, Yangtao Xue, Keke Tang, Peican Zhu and Jing Han
J. Mar. Sci. Eng. 2024, 12(2), 264; https://doi.org/10.3390/jmse12020264 - 1 Feb 2024
Cited by 5 | Viewed by 1491
Abstract
Underwater Acoustic Target Recognition (UATR) plays a crucial role in underwater detection devices. However, because collecting data in the underwater environment is difficult and expensive, UATR still faces the problem of small datasets. Few-shot learning (FSL) addresses this challenge through techniques such as Siamese networks and prototypical networks, but it suffers from overfitting, which leads to catastrophic forgetting and performance degradation. Current underwater FSL methods primarily focus on mining similar information within sample pairs, ignoring the unique features of ship radiation noise. This study proposes a novel cross-domain contrastive learning-based few-shot (CDCF) method for UATR to alleviate overfitting. The approach leverages self-supervised training on both the source and target domains to facilitate rapid adaptation to the target domain. Additionally, a base contrastive module is introduced: positive and negative sample pairs are generated through data augmentation, and the similarity of corresponding frequency bands in the feature embeddings is used to learn fine-grained features of ship radiation noise, thereby expanding the scope of knowledge in the source domain. We evaluate CDCF in diverse scenarios on the ShipsEar and DeepShip datasets. In cross-domain environments, the model achieves accuracy rates of 56.71%, 73.02%, and 76.93% in the 1-shot, 3-shot, and 5-shot scenarios, respectively, outperforming other FSL methods, and it also performs strongly in noisy environments. Full article
(This article belongs to the Special Issue Underwater Wireless Communications: Recent Advances and Challenges)
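As a rough illustration of the band-wise similarity idea behind the base contrastive module, here is a minimal sketch under stated assumptions — the toy augmentation, the band count, and the temperature are all invented for illustration, and this is not the paper's exact CDCF module:

```python
import torch
import torch.nn.functional as F

def augment(spec):
    """Toy augmentation: random gain plus additive Gaussian noise,
    standing in for the paper's three augmentation types."""
    gain = 1.0 + 0.1 * torch.randn(1)
    return gain * spec + 0.01 * torch.randn_like(spec)

def band_contrastive_loss(emb_a, emb_b, n_bands=8, tau=0.1):
    """emb_a, emb_b: (batch, freq, dim) embeddings of two augmented
    views. Split the frequency axis into bands, pull corresponding
    bands together (positives), and push all other bands in the
    batch apart (negatives), InfoNCE-style."""
    a = torch.stack(emb_a.chunk(n_bands, dim=1), dim=1).mean(dim=2)
    b = torch.stack(emb_b.chunk(n_bands, dim=1), dim=1).mean(dim=2)
    a = F.normalize(a.flatten(0, 1), dim=-1)   # (batch*bands, dim)
    b = F.normalize(b.flatten(0, 1), dim=-1)
    logits = a @ b.T / tau                     # cosine similarities
    targets = torch.arange(a.size(0))          # matching band = positive
    return F.cross_entropy(logits, targets)

spec = torch.rand(4, 64, 32)                   # toy (batch, freq, dim) input
loss = band_contrastive_loss(augment(spec), augment(spec))
print(loss.item())
```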
Show Figures

Figure 1: Traditional few-shot learning framework. The model consists of two phases: pre-training and fine-tuning. In the pre-training stage, the feature extractor F_φ and classifier G_θ are trained on the source-domain dataset. In the fine-tuning stage, the parameters of the feature extractor F_φ′ are transferred from F_φ, while the parameters of the classifier G_θ′ are randomly initialized. Fine-tuning is performed on the target domain.
Figure 2: Overall framework of CDCF. In the pre-training phase, the model comprises a feature extractor F_φ and a classifier G_θ, trained on the source-domain dataset. In the fine-tuning stage, the feature extractor F_φ′ initializes its parameters from F_φ and then adapts to the novel domain through self-supervised learning on positive and negative sample pairs.
Figure 3: Comparison between two contrastive learning methods.
Figure 4: Base contrastive module. (a) Calculation of frequency-band similarity for positive sample pairs. (b) Calculation of frequency-band similarity for negative sample pairs.
Figure 5: Diagram of the three types of data augmentation.
Figure 6: Performance comparison with state-of-the-art UATR models.
Figure 7: Performance comparison of few-shot models in noisy environments.
21 pages, 1518 KiB  
Article
A Lightweight Network Based on Multi-Scale Asymmetric Convolutional Neural Networks with Attention Mechanism for Ship-Radiated Noise Classification
by Chenhong Yan, Shefeng Yan, Tianyi Yao, Yang Yu, Guang Pan, Lu Liu, Mou Wang and Jisheng Bai
J. Mar. Sci. Eng. 2024, 12(1), 130; https://doi.org/10.3390/jmse12010130 - 9 Jan 2024
Cited by 1 | Viewed by 1523
Abstract
Ship-radiated noise classification is critical in ocean acoustics. Recently, feature extraction methods that combine time–frequency spectrograms with convolutional neural networks (CNNs) have effectively captured the differences between various underwater targets. However, many existing CNNs are difficult to deploy on embedded devices because of their high computational cost. This paper introduces a lightweight network based on multi-scale asymmetric CNNs with an attention mechanism (MA-CNN-A) for ship-radiated noise classification. Exploiting the multi-resolution relationship between multi-scale convolution kernels and feature maps, MA-CNN-A autonomously extracts fine-grained multi-scale features from the time–frequency domain, while remaining lightweight by employing asymmetric convolutions to balance accuracy and efficiency. The parameters introduced by the attention mechanism account for only 0.02‰ of the model's parameters. Experiments on the DeepShip dataset demonstrate that MA-CNN-A outperforms several state-of-the-art networks, with a recognition accuracy of 98.2% and significantly fewer parameters; compared with a CNN based on three-scale square convolutions, it reduces parameters by 68.1% while improving recognition accuracy. Ablation studies confirm that the improvements stem from the asymmetric convolutions, the multi-scale block, and the attention mechanism. Additionally, MA-CNN-A is robust against various interferences. Full article
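A concrete way to see the parameter saving the abstract claims: a k × k convolution costs k² weights per input–output channel pair, whereas a 1 × k convolution followed by a k × 1 convolution costs 2k with the same receptive field. The sketch below assembles a multi-scale block from such factorized branches; the branch count, kernel scales, and channel widths are assumptions, not MA-CNN-A's exact configuration.

```python
import torch
import torch.nn as nn

def asym_branch(ch, k):
    """1 x k then k x 1 convolution: the asymmetric factorization of
    a k x k kernel (k^2 weights down to 2k per channel pair)."""
    return nn.Sequential(
        nn.Conv2d(ch, ch, (1, k), padding=(0, k // 2)),
        nn.Conv2d(ch, ch, (k, 1), padding=(k // 2, 0)),
        nn.BatchNorm2d(ch), nn.ReLU())

class MultiScaleAsymBlock(nn.Module):
    """Parallel asymmetric branches at different kernel scales,
    concatenated and fused by a 1 x 1 convolution."""
    def __init__(self, ch, scales=(3, 5, 7, 9)):
        super().__init__()
        self.branches = nn.ModuleList(asym_branch(ch, k) for k in scales)
        self.fuse = nn.Conv2d(ch * len(scales), ch, 1)  # channel fusion

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(2, 32, 64, 64)
print(MultiScaleAsymBlock(32)(x).shape)   # torch.Size([2, 32, 64, 64])
```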
Show Figures

Figure 1: The MA-CNN-A framework. MA-CNN-A is an end-to-end system comprising audio pre-processing, feature extraction, classifier learning, and recognition stages; it receives ship-radiated noise and outputs recognition results. "GAP" denotes global average pooling and "FC" a fully connected layer.
Figure 2: Detailed steps of feature extraction. Blue boxes represent operations; green boxes represent the extracted two-dimensional features.
Figure 3: Mel spectrograms of (a) cargo, (b) passenger ship, (c) oil tanker, and (d) tug radiated noise.
Figure 4: Comparison of square and asymmetric convolution.
Figure 5: Structure of the multi-scale asymmetric block. Multi-scale features are extracted from four branches, each employing four asymmetric convolutions.
Figure 6: Structure of the SE module. F_tr denotes the 2D convolution applied to the input feature, F_sq the squeeze operation, and F_scale the operation that multiplies the input X by the channel weights.
Figure 7: Weight-value generation in ECA. The module generates channel weights by applying a fast 1D convolution of size k to the features aggregated through GAP; the kernel size k is adaptively determined from the channel dimension C.
Figure 8: Structural diagram of the MA-CNN-A model. The deep blue block represents the CBA block, comprising a 2D convolutional layer, a batch normalization layer, and an activation layer.
Figure 9: Confusion matrices of the MA-CNN-A model.
Figure 10: t-SNE visualizations: (a) of the Mel spectrogram features; (b) of the MA-CNN-A output.
Figure 11: Validation accuracy over the course of training. The colors denote MA-CNN-A, ResNet18, EfficientNetv2, MobileNetv2, ShuffleNetv2, Transformer, and CRNN, respectively.
Figure 12: Results for different attention mechanisms. The x-axis gives the number of model parameters and the y-axis the model accuracy; FLOPs are indicated by the circle radii.
Figure 13: Test at low SNR. The colors denote MA-CNN-A, ResNet18, EfficientNetv2, MobileNetv2, ShuffleNetv2, Transformer, and CRNN, respectively.
Figure 14: Experiments on the recognition system with other features. Mel spectrograms, MFCCs, and STFT spectrograms are fed into the various models, respectively.
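The ECA weight generation summarized in the Figure 7 caption above also lends itself to a compact sketch. This is an illustrative implementation under assumptions: the kernel-size mapping constants (gamma = 2, b = 1) follow the original ECA paper, not necessarily this article.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """ECA-style channel attention: global average pooling, a 1D
    convolution of adaptive size k across channels, then reweighting."""
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Adaptive kernel size: k grows with log2(C), forced odd.
        k = int(abs((math.log2(channels) + b) / gamma))
        k = k if k % 2 else k + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                      # x: (batch, C, H, W)
        y = x.mean(dim=(2, 3))                 # GAP -> (batch, C)
        y = self.conv(y.unsqueeze(1))          # 1D conv across channels
        w = torch.sigmoid(y).squeeze(1)        # (batch, C) channel weights
        return x * w[:, :, None, None]         # rescale each channel map

x = torch.randn(2, 64, 40, 40)
print(ECA(64)(x).shape)                        # torch.Size([2, 64, 40, 40])
```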
15 pages, 3545 KiB  
Article
A Novel Underwater Acoustic Target Recognition Method Based on MFCC and RACNN
by Dali Liu, Hongyuan Yang, Weimin Hou and Baozhu Wang
Sensors 2024, 24(1), 273; https://doi.org/10.3390/s24010273 - 2 Jan 2024
Cited by 9 | Viewed by 2092
Abstract
In ocean remote sensing missions, recognizing underwater acoustic targets is a crucial technology for marine biological surveys, ocean exploration, and other scientific activities conducted in water. The complex acoustic propagation characteristics present significant challenges for underwater acoustic target recognition (UATR). Methods have been proposed that extract the DEMON spectrum of a signal and feed it into an artificial neural network, or that fuse the multidimensional features of a signal for recognition; however, there is still room for improvement in noise immunity, computational performance, and reduced reliance on specialized knowledge. In this article, we propose the Residual Attentional Convolutional Neural Network (RACNN), a convolutional neural network that quickly and accurately recognizes the type of ship-radiated noise by extracting internal features from the Mel frequency cepstral coefficients (MFCCs) of underwater ship-radiated noise. Experimental results demonstrate that the proposed model achieves an overall accuracy of 99.34% on the ShipsEar dataset, surpassing conventional recognition methods and other deep learning models. Full article
(This article belongs to the Section Sensor Networks)
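The pipeline the abstract outlines — MFCC extraction followed by a residual, attention-gated CNN — can be sketched as follows. This is a minimal illustration, not the authors' RACNN: the file name is hypothetical, and the frame sizes, coefficient count, and layer widths are assumptions.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn

# MFCC extraction (hypothetical input file; frame sizes are assumptions).
y, sr = librosa.load("ship_noise.wav", sr=None)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40,
                            n_fft=1024, hop_length=512)
x = torch.from_numpy(mfcc[None, None].astype(np.float32))  # (1, 1, 40, T)

class ResidualAttnBlock(nn.Module):
    """Two conv-BN stages with SE-style channel attention on the
    residual branch, then a gated skip connection."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch // 4, 1), nn.ReLU(),
            nn.Conv2d(ch // 4, ch, 1), nn.Sigmoid())

    def forward(self, x):
        h = self.body(x)
        return torch.relu(x + h * self.se(h))  # attention-gated residual

stem = nn.Conv2d(1, 16, 3, padding=1)
feat = ResidualAttnBlock(16)(stem(x))
print(feat.shape)                               # (1, 16, 40, T)
```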
Show Figures

Figure 1: The extraction process of MFCC.
Figure 2: A typical sample of an underwater target's MFCC feature.
Figure 3: Structure of the proposed RACNN model.
Figure 4: Results of the FFT, Mel filter bank, and DCT for different radiated noise: (a) natural noise in Class A; (b) tugboat noise in Class B; (c) motorboat noise in Class C; (d) passenger noise in Class D; (e) ocean liner noise in Class E.
Figure 5: Experimental process for the different RACNN models.
Figure 6: Training process of the proposed RACNN model.
Figure 7: Confusion matrix of RACNN on the test dataset.
Figure 8: Training process of different deep learning models.