Auditory Inspired Convolutional Neural Networks for Ship Type Classification with Raw Hydrophone Data
Figure 1. Auditory-inspired convolutional neural network structure. In the time convolutional layer, four colors represent four groups of auditory filters with different center frequencies and impulse widths. In the permute layer and energy-pooling layer, the decomposed signals are converted to frequency feature maps, each of which corresponds to one frame. In the frequency convolutional layers, convolution operations are applied along both the time and frequency axes. At the end of the network, several fully connected layers and a target layer predict the class.
Figure 2. Gammatone auditory filters. (a) Four time-domain filters with different center frequencies (CF); (b) frequency magnitude responses of all 128 filters; (c) relationship between center frequencies and bandwidths.
Figure 3. Gamma envelopes of Gammatone filters. (a) Gamma envelopes of four filters; (b) magnitudes of the gamma envelopes of all 128 filters; (c) relationship between impulse widths and center frequencies.
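The filters in Figures 2 and 3 follow the standard Gammatone form, whose gamma envelope and ERB bandwidth come from Glasberg and Moore's formula and Slaney's Patterson-Holdsworth implementation (both cited in the references below). A minimal NumPy sketch of such a 128-filter bank; the sampling rate, filter order, impulse-response duration, and the 50 Hz to 8 kHz span are illustrative assumptions, not values from the paper:

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth (Hz) at center frequency fc,
    after Glasberg & Moore (1990)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.05, order=4):
    """Gammatone impulse response: a gamma envelope t^(n-1) * exp(-2*pi*b*ERB*t)
    multiplied by a tone at the center frequency (b = 1.019 is conventional)."""
    t = np.arange(int(duration * fs)) / fs
    env = t ** (order - 1) * np.exp(-2.0 * np.pi * 1.019 * erb(fc) * t)
    g = env * np.cos(2.0 * np.pi * fc * t)
    return g / np.abs(g).max()  # peak-normalize

def erb_space(f_lo, f_hi, n):
    """n center frequencies spaced uniformly on the ERB-rate scale
    (constants from Slaney's Patterson-Holdsworth implementation)."""
    c = 9.26449 * 24.7
    k = np.arange(1, n + 1) / n
    return np.sort(-c + (f_hi + c) * np.exp(k * np.log((f_lo + c) / (f_hi + c))))

fs = 16000                                   # assumed sampling rate
cfs = erb_space(50.0, 8000.0, 128)           # assumed span, 128 filters as in the paper
bank = [gammatone_ir(fc, fs) for fc in cfs]  # one kernel per center frequency
```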
Figure 4. The length of the passenger ship is 139 m, and the recording segment is 250 ms. During the recording period, the ship is 1.95 km from the hydrophone and its navigational speed is 18.4 kn. Its radiated noise is convolved with each of three Gammatone filters, whose center frequencies are 49 Hz (orange line), 194 Hz (green line), and 432 Hz (blue line). The energy of each component is summed to convert it to the frequency domain.
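The decompose-then-sum-energy step described in this caption is easy to mirror in code. A sketch reusing `bank` and `fs` from the snippet above, with a synthetic signal standing in for the recorded ship noise and the 250 ms segment from the caption used as the frame length:

```python
# Convolve a signal with every Gammatone kernel and sum the energy of each
# decomposed component per frame, yielding one (filter, frame) feature map,
# analogous to the permute + energy-pooling stage of the network.
frame_len = int(0.25 * fs)              # 250 ms frames, as in the caption
x = np.random.randn(4 * frame_len)      # synthetic stand-in for hydrophone data

rows = []
for g in bank:                          # bank and fs from the previous sketch
    y = np.convolve(x, g, mode="same")  # time convolution with one kernel
    frames = y[: len(y) // frame_len * frame_len].reshape(-1, frame_len)
    rows.append((frames ** 2).sum(axis=1))  # energy of the component per frame

feature_map = np.array(rows)            # shape: (128 filters, n_frames)
```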
Figure 5. Spectrograms of the hydrophone signals for each category. (a) Background noise; (b) cargo; (c) passenger ship; (d) pleasure craft; (e) tanker; (f) tug.
Figure 6. ROC curves of the classification results for all methods, obtained by treating one class as positive and the remaining classes as negative. (a) Waveform features; (b) wavelet features; (c) MFCC; (d) mel-frequency; (e) nonlinear auditory features; (f) spectral; (g) cepstral; (h) untrainable Gammatone initialized; (i) randomly initialized; (j) proposed method.
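The one-vs-rest ROC construction described in the caption is the standard multi-class recipe. A minimal scikit-learn sketch with placeholder labels and scores; the real inputs would be the test labels and the 6-way network outputs:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

classes = ["background noise", "cargo", "passenger ship",
           "pleasure craft", "tanker", "tug"]
rng = np.random.default_rng(0)
y_true = rng.integers(0, 6, size=500)              # placeholder test labels
y_score = rng.random((500, 6))                     # placeholder 6-way scores
y_bin = label_binarize(y_true, classes=range(6))   # one indicator column per class

for k, name in enumerate(classes):
    # Class k is treated as positive, all other classes as negative.
    fpr, tpr, _ = roc_curve(y_bin[:, k], y_score[:, k])
    print(f"{name}: AUC = {auc(fpr, tpr):.3f}")
```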
Figure 7. Comparison of optimized Gammatone kernels and conventional Gammatone kernels. (a) Filters in group 1; (b) filters in group 2; (c) filters in group 3; (d) filters in group 4.
Figure 8. Center frequency-bandwidth distribution of the optimized Gammatone kernels, plotted together with that of the conventional Gammatone filters.
Figure 9. t-SNE visualization of features learned by the proposed model and the compared models. (a) Waveform features; (b) wavelet features; (c) MFCC; (d) mel-frequency; (e) nonlinear auditory features; (f) spectral; (g) cepstral; (h) untrainable Gammatone initialized; (i) randomly initialized; (j) proposed.
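The t-SNE projection used for Figure 9 can be reproduced with scikit-learn on any learned feature matrix. A minimal sketch with random stand-ins for the learned features; the 32-dimensional size matches the fully connected layer width in the model table, while everything else is a placeholder:

```python
import numpy as np
from sklearn.manifold import TSNE

feats = np.random.randn(600, 32)        # stand-in for 32-dim learned features
labels = np.random.randint(0, 6, 600)   # stand-in class labels
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(feats)
# Scatter emb[:, 0] vs. emb[:, 1], colored by `labels`, for a Figure 9-style plot.
```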
Abstract
1. Introduction
2. Architecture of Auditory Inspired Convolutional Neural Network for Ship Type Classification
3. Learned Auditory Filter Banks for Ship Radiated Noise Modeling
3.1. Auditory Filter Banks and Time Convolutional Layer
3.2. Multi-Scale Convolutional Kernels
4. Auditory Cortex Inspired Discriminative Learning for Ship Type Classification
4.1. Permute Layer and Energy-Pooling Layer
4.2. Frequency Convolutional Layer and Target Layer
5. Experiments and Discussion
5.1. Experimental Dataset
5.2. Classification Experiments
5.3. Visualization and Analysis
5.3.1. Visualization and Analysis of Learned Filters
5.3.2. Feature Visualization and Cluster Analysis
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Meng, Q.; Yang, S.; Piao, S. The classification of underwater acoustic target signals based on wave structure and support vector machine. J. Acoust. Soc. Am. 2014, 136, 2265.
- Meng, Q.; Yang, S. A wave structure based method for recognition of marine acoustic target signals. J. Acoust. Soc. Am. 2015, 137, 2242.
- Wei, X.; Li, G.-H.; Wang, Z.Q. Underwater target recognition based on wavelet packet and principal component analysis. Comput. Simul. 2011, 28, 8–290.
- Siddagangaiah, S.; Li, Y.; Guo, X.; Chen, X.; Zhang, Q.; Yang, K.; Yang, Y. A complexity-based approach for the detection of weak signals in ocean ambient noise. Entropy 2016, 18, 101.
- Das, A.; Kumar, A.; Bahl, R. Marine vessel classification based on passive sonar data: The cepstrum-based approach. IET Radar Sonar Navig. 2013, 7, 87–93.
- Santos-Domínguez, D.; Torres-Guijarro, S.; Cardenal-López, A.; Pena-Gimenez, A. ShipsEar: An underwater vessel noise database. Appl. Acoust. 2016, 113, 64–69.
- Yang, L.; Chen, K.; Zhang, B.; Liang, Y. Underwater acoustic target classification and auditory feature identification based on dissimilarity evaluation. Acta Phys. Sin. 2014, 63, 134304.
- Zhang, L.; Wu, D.; Han, X.; Zhu, Z. Feature extraction of underwater target signal using Mel frequency cepstrum coefficients based on acoustic vector sensor. J. Sens. 2016, 2016, 7864213.
- Smith, E.C.; Lewicki, M.S. Efficient auditory coding. Nature 2006, 439, 978–982.
- Yang, H.; Gan, A.; Chen, H.; Pan, Y. Underwater acoustic target recognition using SVM ensemble via weighted sample and feature selection. In Proceedings of the International Bhurban Conference on Applied Sciences and Technology, Islamabad, Pakistan, 12–16 January 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 522–527.
- Filho, W.S.; Seixas, J.M.D.; Moura, N.N.D. Preprocessing passive sonar signals for neural classification. IET Radar Sonar Navig. 2011, 5, 605–612.
- Chen, C.H.; Lee, J.D.; Lin, M.C. Classification of underwater signals using neural networks. Tamkang J. Sci. Eng. 2000, 3, 31–48.
- Damianos, K.; Jan, S.; Richard, S.; William, H.; John, M. Individual ship detection using underwater acoustics. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Calgary, AB, Canada, 15–20 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2121–2125.
- Damianos, K.; William, H.; Richard, S.; John, M.; Stavros, T.; Edin, I.; George, S. Applying speech technology to the ship-type classification problem. In Proceedings of the OCEANS 2017, Anchorage, AK, USA, 18–21 September 2017; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
- Filho, J.B.O.S.; Seixas, J.M.D. Class-modular multi-layer perceptron networks for supporting passive sonar signal classification. IET Radar Sonar Navig. 2016, 10, 311–317.
- Sainath, T.N.; Kingsbury, B.; Mohamed, A.R.; Ramabhadran, B. Learning filter banks within a deep neural network framework. In Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Olomouc, Czech Republic, 8–12 December 2013; IEEE: Piscataway, NJ, USA, 2014; pp. 297–302.
- Kamal, S.; Mohammed, S.K.; Pillai, P.R.S.; Supriya, M.H. Deep learning architectures for underwater target recognition. In Proceedings of the 2013 International Symposium on Ocean Electronics, Kochi, India, 23–25 October 2013; IEEE: Piscataway, NJ, USA, 2014; pp. 48–54.
- Cao, X.; Zhang, X.; Yu, Y.; Niu, L. Deep learning-based recognition of underwater target. In Proceedings of the IEEE International Conference on Digital Signal Processing, Beijing, China, 16–18 October 2017; IEEE: Piscataway, NJ, USA, 2018; pp. 89–93.
- Yang, H.; Shen, S.; Yao, X.; Sheng, M.; Wang, C. Competitive deep-belief networks for underwater acoustic target recognition. Sensors 2018, 18, 952.
- Shen, S.; Yang, H.; Sheng, M. Compression of a deep competitive network based on mutual information for underwater acoustic targets recognition. Entropy 2018, 20, 243.
- Mu, L.; Peng, Y.; Qiu, M.; Yang, X.; Hu, C.; Zhang, F. Study on modulation spectrum feature extraction of ship radiated noise based on auditory model. In Proceedings of the 2016 IEEE/OES China Ocean Acoustics, Harbin, China, 9–11 January 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–5.
- Ghosh, J.; Deuser, L.; Beck, S.D. A neural network based hybrid system for detection, characterization, and classification of short-duration oceanic signals. IEEE J. Ocean. Eng. 1992, 17, 351–363.
- Zwicker, E.; Fastl, H.; Hartmann, W.M. Psychoacoustics: Facts and models. Phys. Today 2001, 54, 64–65.
- Moore, B.C.J. Cochlear Hearing Loss: Physiological, Psychological and Technical Issues, 2nd ed.; John Wiley and Sons: Hoboken, NJ, USA, 2008.
- Gelfand, S.A. Hearing: An Introduction to Psychological and Physiological Acoustics; CRC Press: Boca Raton, FL, USA, 2009; p. 320.
- Slaney, M. An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank; Apple Computer: Cupertino, CA, USA, 1993.
- Glasberg, B.R.; Moore, B.C. Derivation of auditory filter shapes from notched-noise data. Hear. Res. 1990, 47, 103–138.
- Arora, S.; Bhaskara, A.; Ge, R.; Ma, T. Provable bounds for learning some deep representations. In Proceedings of the International Conference on Machine Learning, Beijing, China, 21–26 June 2014; JMLR, Inc.: Cambridge, MA, USA, 2014; pp. 584–592.
- Smith, E.C.; Lewicki, M.S. Efficient coding of time-relative structure using spikes. Neural Comput. 2005, 17, 19–45.
- Chechik, G.; Nelken, I. Auditory abstraction from spectro-temporal features to coding auditory entities. Proc. Natl. Acad. Sci. USA 2012, 109, 18968–18973.
- Decharms, R.C.; Blake, D.T.; Merzenich, M.M. Optimizing sound features for cortical neurons. Science 1998, 280, 1439–1443.
- Baqar, M.; Zaidi, S.S.H. Performance evaluation of linear and multi-linear subspace learning techniques for object classification based on underwater acoustics. In Proceedings of the International Bhurban Conference on Applied Sciences and Technology, Islamabad, Pakistan, 10–14 January 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 675–683.
- Meddis, R.; Lopez-Poveda, E.A. Auditory Periphery: From Pinna to Auditory Nerve; Springer: Boston, MA, USA, 2010; pp. 7–38.
- Mckenna, M.F.; Ross, D.; Wiggins, S.M.; Hildebrand, J.A. Underwater radiated noise from modern commercial ships. J. Acoust. Soc. Am. 2012, 131, 92–103.
- Van der Maaten, L.; Hinton, G.E. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
CNNs Trained on Hand-Designed Features | CNNs with the Same Structure as the Proposed Model | Proposed Model |
---|---|---|
Extracted features: waveform, wavelet, MFCC, mel-frequency, nonlinear auditory filter, spectral, and cepstral. | 1 multi-scale time convolutional layer with 128 kernels, initialized randomly or with an untrainable Gammatone bank | 1 multi-scale time convolutional layer with 128 kernels initialized with Gammatone filters |
1 permute layer | |
1 energy-pooling layer | |
3 convolutional layers with 32 kernels each | |
3 fully connected layers with 32 units each | |
1 target layer with 6 units |
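Reading the proposed-model column of this table top to bottom gives a concrete network; a minimal PyTorch sketch follows. The 128 multi-scale time kernels (widths matching the [100, 200, 400, 800] values in the results table), the three 32-kernel convolutional layers, the three 32-unit fully connected layers, and the 6-unit target layer come from the table; the 3x3 frequency kernels, the energy-pooling frame length, and the input segment length are assumptions:

```python
import torch
import torch.nn as nn

class AuditoryCNN(nn.Module):
    """Sketch: multi-scale time convolution -> energy pooling over frames
    -> frequency convolutions -> fully connected layers -> 6-way target."""

    def __init__(self, frame=400):
        super().__init__()
        # Four groups of 32 kernels (128 total); widths follow the
        # [100, 200, 400, 800] values in the results table.
        self.groups = nn.ModuleList([
            nn.Conv1d(1, 32, w, padding="same", bias=False)
            for w in (100, 200, 400, 800)
        ])
        self.frame = frame  # energy-pooling frame length (assumption)
        self.freq_convs = nn.Sequential(   # 3 conv layers, 32 kernels each;
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),   # 3x3 is an assumption
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(         # 3 FC layers + 6-unit target layer
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 6),
        )

    def forward(self, x):                  # x: (batch, 1, samples)
        z = torch.cat([g(x) for g in self.groups], dim=1)   # (batch, 128, T)
        e = nn.functional.avg_pool1d(z.pow(2), self.frame)  # energy per frame
        h = self.freq_convs(e.unsqueeze(1)).flatten(1)      # (batch, 32)
        return self.head(h)                                 # 6-way logits

logits = AuditoryCNN()(torch.randn(2, 1, 4000))  # 4000-sample segments (assumed)
```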
Parameters | Values |
---|---|
Learning rate | 0.0001 |
Batch size | 50 |
Epochs | 84 |
Optimizer | RMSprop |
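These hyperparameters drop directly into a training loop. A sketch wiring the table's optimizer, learning rate, batch size, and epoch count around the `AuditoryCNN` sketch above; the dataset tensors are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = AuditoryCNN()                    # from the architecture sketch above
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-4)  # per the table
loss_fn = torch.nn.CrossEntropyLoss()

X = torch.randn(500, 1, 4000)            # placeholder waveform segments
y = torch.randint(0, 6, (500,))          # placeholder ship-type labels
loader = DataLoader(TensorDataset(X, y), batch_size=50, shuffle=True)

for epoch in range(84):                  # 84 epochs, per the table
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
```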
Input | Features/Methods | Input Dimension | Convolutional Kernel Width | Accuracy |
---|---|---|---|---|
Hand-designed features | Waveform [1,2] | | 5 | 0.574 |
| Wavelet [3] | | 5 | 0.679 |
| MFCC [8] | | 5 | 0.576 |
| Mel-frequency | | 5 | 0.685 |
| Nonlinear auditory | | 5 | 0.726 |
| Spectral [17,18] | | 100 | 0.732 |
| Cepstral [5,6] | | 50 | 0.712 |
Raw time-domain data | Untrainable Gammatone | | [100, 200, 400, 800] | 0.608 |
| Randomly initialized | | [100, 200, 400, 800] | 0.753 |
| Proposed model | | [100, 200, 400, 800] | 0.792 |
Class | Precision | Recall | F1-Score |
---|---|---|---|
Background noise | 0.73 | 0.94 | 0.82 |
Cargo | 0.96 | 0.79 | 0.87 |
Passenger ship | 0.82 | 0.91 | 0.86 |
Pleasure craft | 0.82 | 0.72 | 0.77 |
Tanker | 0.73 | 0.86 | 0.79 |
Tug | 0.72 | 0.54 | 0.62 |
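Per-class precision, recall, and F1 tables of this form come straight from scikit-learn's classification_report; a minimal sketch with placeholder predictions standing in for the model's outputs:

```python
import numpy as np
from sklearn.metrics import classification_report

names = ["Background noise", "Cargo", "Passenger ship",
         "Pleasure craft", "Tanker", "Tug"]
y_true = np.random.randint(0, 6, 300)    # placeholder ground-truth labels
y_pred = np.random.randint(0, 6, 300)    # placeholder model predictions
print(classification_report(y_true, y_pred,
                            labels=list(range(6)), target_names=names))
```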
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).