Improving Deep Learning for Maritime Remote Sensing through Data Augmentation and Latent Space
Figure 1. Images for two wake events (one per row). The first column on the left shows the C-band images, followed by the S-band and X-band images. The final column on the right shows the wake masks used for the U-Net segmentation model. Note that there is no persistent wake for the images in the second row, and therefore there is no mask. All images of a wake event have the same augmentation applied to them for consistency in the augmented dataset.
Figure 2. Performance of models trained with no augmentations, tested on all rotation augmentation sets. Models are measured using the PR AUC metric, and performance drops to its lowest around the 90° rotation.
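The PR AUC values in Figure 2 can be computed from per-image wake scores with scikit-learn; the snippet below is a minimal sketch with placeholder labels and probabilities, not the authors' exact evaluation code.

```python
# Minimal PR AUC sketch (placeholder inputs, not the paper's pipeline).
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                    # hypothetical wake / no-wake labels
y_score = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])   # hypothetical model probabilities

precision, recall, _ = precision_recall_curve(y_true, y_score)
pr_auc = auc(recall, precision)                                  # area under the precision-recall curve
print(f"PR AUC: {pr_auc:.3f}")
```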
Figure 3. S-band latent space with labels for the rotation augmentation angle of each cluster. These labels help to show that clusters are forming based on the difference of the rotation angle from 0° or 90°, revealing inadvertent clustering based on the rotation of a square image.
Figure 4. Circular cropping is applied to the images so that rotation augmentations do not create the triangular clippings seen in Figure 1. Any time a rotation is applied to these images, there is no change in the amount or shape of overall information in the image, only the orientation of the wake. Note that there is no persistent wake for the images in the second row, and therefore there is no mask.
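One way to implement the circular crop described in Figure 4 is to zero out everything outside the largest circle inscribed in the image, so a subsequent rotation changes only the wake orientation and never the amount of image content. The helper below is a minimal sketch assuming NumPy/SciPy arrays; the function name and the re-masking step are illustrative, not the authors' code.

```python
import numpy as np
from scipy.ndimage import rotate

def circular_crop(image: np.ndarray) -> np.ndarray:
    """Zero out pixels outside the largest circle inscribed in the image."""
    h, w = image.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = min(h, w) / 2.0
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return np.where(mask, image, 0.0)

# A rotation of a circular-cropped image introduces no new corner clipping:
img = np.random.rand(128, 128)            # hypothetical SAR chip
cropped = circular_crop(img)
rotated = rotate(cropped, angle=45, reshape=False, order=1)
rotated = circular_crop(rotated)          # re-mask to remove interpolation spill at the rim
```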
Figure 5. C-band only 2D UMAP latent space using circular crop images. Each augmentation set has its mean location pinpointed with a black dot and connected via a black line with arrows to show the direction of increasing rotation angle.
Figure 6. S-band only 2D UMAP latent space using circular crop images. All rotation augmentations are used, and the results show two clusters of images based on look angle: 0° and 180° in the upper right, and 90° and 270° in the lower left.
Figure 7. X-band only 2D UMAP latent space using circular crop images. All rotation augmentations are used, and the results show two clusters of images based on look angle: 0° and 180° in the upper left, and 90° and 270° in the lower right. The 90° and 270° look angle images form a pattern similar to C-band (Figure 5), while the 0° and 180° look angle images form a pattern similar to S-band (Figure 6).
Figure 8. Comparison of the same wake event in the X-band. On the left is the wake with a look angle of 0°, and on the right is the wake with a look angle of 90°. All other generation parameters are the same for these images. The noise level in the background ocean surface on the left is typical of 0° and 180° look angles, while that on the right is typical of 90° and 270° look angles.
Figure 9. Two-dimensional UMAP latent space generated using all rotation-augmented circular crop images. C-band and the 90° and 270° look angle X-band images cluster together in the lower right of the image. S-band and the 0° and 180° X-band images all cluster apart.
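The 2D UMAP latent spaces above can be reproduced in outline with the umap-learn package. The snippet below is a sketch that flattens each image chip into a feature vector; the array shapes, preprocessing, and UMAP settings are assumptions, not the authors' exact configuration.

```python
import numpy as np
import umap  # umap-learn

# Hypothetical stack of SAR chips: (n_images, height, width), flattened per image.
images = np.random.rand(500, 128, 128)
features = images.reshape(len(images), -1)

reducer = umap.UMAP(n_components=2, random_state=42)
embedding = reducer.fit_transform(features)   # (n_images, 2) latent-space coordinates

# The higher-dimensional latent spaces referenced later (e.g., 5D and 10D)
# only change n_components.
```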
Figure 10. Distribution of the metadata features for the four training and testing folds used in the baseline study. The folds are designed to stratify `contains_wake`, `look_angle`, and `run_name` to have matching distributions for each fold.
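Folds with matching distributions of `contains_wake`, `look_angle`, and `run_name` can be built by stratifying on a composite key of those metadata fields. The sketch below, using scikit-learn's StratifiedKFold, is an assumption about the mechanics rather than the authors' exact fold-generation code.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Hypothetical metadata table, one row per image.
meta = pd.DataFrame({
    "contains_wake": [1, 0, 1, 0] * 25,
    "look_angle":    [0, 90, 180, 270] * 25,
    "run_name":      (["run_a"] * 50) + (["run_b"] * 50),
})

# Composite key so every fold sees the same mix of all three attributes.
key = (meta["contains_wake"].astype(str) + "_"
       + meta["look_angle"].astype(str) + "_"
       + meta["run_name"])

skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(meta, key)):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```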
Figure 11. Results for the baseline latent space study. This study is run with four sets of the data, using a combination of non-rotated or randomly rotated images and either normal crop or circular crop images. All sets have matching results, so only one set is presented. The model trained on C-band performed perfectly for C-band and the 90° and 270° look angle X-band images, while it did poorly for S-band and the 0° and 180° look angle X-band images. These results confirm that the latent space representation of the data is related to the model's performance on that data. The results also convey that latent space is useful regardless of image cropping or applied augmentations. However, cropping and augmentations are consistent between the training and testing images; the only difference is the SAR band.
Figure 12. Performance study results for C-band using the U-Net classifier architecture. Each model is trained with an even split of augmented and non-augmented images, then tested on 0° rotated images (top row) and 90° rotated images (bottom row). We use the MCC metric and a sweep of thresholds to present a profile of performance rather than a single metric. Training augmentations are chosen from different latent space representations; moving left to right, they are 2D, 5D, 10D, and a combined selection using all three, with the baseline performance on the far right for comparison.
Figure 13. Performance study results for S-band using the U-Net classifier architecture. Each model is trained with an even split of augmented and non-augmented images, then tested on 0° rotated images (top row) and 90° rotated images (bottom row). We use the MCC metric and a sweep of thresholds to present a profile of performance rather than a single metric. Training augmentations are chosen from the 2D latent space representations. Set 1 chooses based on the 90°/270° look angle images, and Set 2 chooses based on the 0°/180° look angle images. The baseline model performance is on the far right for comparison.
Figure 14. Performance study results for X-band using the U-Net classifier architecture. Each model is trained with an even split of augmented and non-augmented images, then tested on 0° rotated images (top row) and 90° rotated images (bottom row). We use the MCC metric and a sweep of thresholds to present a profile of performance rather than a single metric. Training augmentations are chosen from the 2D latent space representations. Set 1 chooses based on the 90°/270° look angle images, and Set 2 chooses based on the 0°/180° look angle images. The baseline model performance is on the far right for comparison.
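The MCC-versus-threshold profiles in Figures 12-14 can be produced by sweeping a decision threshold over predicted wake probabilities and computing MCC at each step. A minimal sketch, with placeholder labels and scores (not the authors' evaluation code):

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def mcc_profile(y_true, y_score, thresholds=np.arange(0.1, 1.0, 0.1)):
    """MCC at each decision threshold, giving a performance profile rather than a single number."""
    return {float(t): matthews_corrcoef(y_true, (y_score >= t).astype(int)) for t in thresholds}

# Hypothetical labels and wake probabilities from a trained classifier.
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0, 1, 0])
y_score = np.array([0.95, 0.7, 0.4, 0.2, 0.6, 0.35, 0.8, 0.15, 0.55, 0.45])
for threshold, mcc in mcc_profile(y_true, y_score).items():
    print(f"threshold {threshold:.1f}: MCC = {mcc:+.2f}")
```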
Abstract
1. Introduction
2. Background
2.1. What Are Data Augmentations?
2.2. Dimensionality Reduction Techniques
2.3. Ship Wake SAR Imagery Dataset
2.4. Deep Learning Models for Ship Wake Detection
3. Methods
3.1. How We Augmented the Data
3.2. Performing a Sensitivity Analysis
3.3. Creating the Latent Space
3.3.1. A Brief Note of Caution
3.3.2. Effect of Augmentations on Latent Space Representation
4. Understanding Model Performance and Latent Space
5. Using Latent Space to Improve Model Performance
5.1. Selecting Augmented Sets in Latent Space
5.2. C-Band Performance Results
5.3. S- and X-Band Results
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
ASC | Attributed Scattering Centers |
ATR | Automatic Target Recognition |
CAD | Computer Aided Design |
CFD | Computational Fluid Dynamics |
CNN | Convolutional Neural Network |
CST | Computer Simulation Technology |
csv | Comma-Separated Values |
EO | Electro-Optical |
ERIM | Environmental Research Institute of Michigan |
GAN | Generative Adversarial Network |
MCC | Matthews Correlation Coefficient |
MDPI | Multidisciplinary Digital Publishing Institute |
MRI | Magnetic Resonance Imaging |
PCA | Principal Component Analysis |
PR | Precision-Recall (Curve) |
PR AUC | Area Under the Precision-Recall Curve |
ROC | Receiver Operating Characteristic |
SAR | Synthetic Aperture Radar |
t-SNE | T-distributed Stochastic Neighbor Embedding |
UMAP | Uniform Manifold Approximation and Projection |
UUID | Universally Unique Identifier |
Parameter | Possible Values |
---|---|
SAR band | C, S, X |
Look angle | 0, 90, 180, 270 |
Polarization | VV, HH |
Inclination angle | 30, 40, 50, 60 |
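The table above spans 3 bands x 4 look angles x 2 polarizations x 4 inclination angles, i.e., 96 simulation configurations. A short illustrative sketch of enumerating that grid (variable names are placeholders):

```python
from itertools import product

bands = ["C", "S", "X"]
look_angles = [0, 90, 180, 270]          # degrees
polarizations = ["VV", "HH"]
inclination_angles = [30, 40, 50, 60]    # degrees

configurations = list(product(bands, look_angles, polarizations, inclination_angles))
print(len(configurations))               # 96 parameter combinations
```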
Rotation Angle | Mean UMAP Component 1 | Mean UMAP Component 2 |
---|---|---|
0 | 11.79 | −3.82 |
15 | 11.69 | −3.72 |
30 | 11.21 | −3.09 |
45 | 13.51 | −4.12 |
60 | 12.99 | −5.27 |
75 | 11.84 | −5.68 |
90 | 10.54 | −5.46 |
105 | 9.81 | −4.27 |
120 | 14.20 | −2.72 |
135 | 10.61 | −1.62 |
150 | 12.30 | −2.56 |
165 | 12.09 | −3.73 |
180 | 12.33 | −3.86 |
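The cluster means above (one per rotation angle) can be computed by grouping the 2D UMAP embedding by augmentation angle. The pandas sketch below is illustrative, with column names and values assumed.

```python
import numpy as np
import pandas as pd

# Hypothetical table: one row per embedded image, with its rotation-augmentation angle.
df = pd.DataFrame({
    "rotation_angle": np.repeat([0, 15, 30], 50),
    "umap_1": np.random.randn(150) + 12.0,
    "umap_2": np.random.randn(150) - 4.0,
})

cluster_means = df.groupby("rotation_angle")[["umap_1", "umap_2"]].mean()
print(cluster_means)   # mean UMAP component 1 and 2 per rotation angle
```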
Dim. | 0 | 15 | 30 | 45 | 60 | 75 | 90 | 105 | 120 | 135 | 150 | 165 | 180 | Closest Angle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2D | 1.72 | 1.75 | 2.10 | 2.69 | 2.06 | 1.12 | - | 1.22 | 3.78 | 3.32 | 2.85 | 1.93 | 1.99 | 75 |
5D | 2.83 | 2.52 | 4.24 | 3.12 | 3.89 | 4.41 | - | 4.55 | 3.65 | 3.39 | 4.07 | 2.71 | 2.72 | 15 |
10D | 3.53 | 4.31 | 4.92 | 4.99 | 4.99 | 4.99 | - | 5.00 | 5.00 | 5.00 | 5.00 | 4.48 | 4.94 | 0 |
All | 8.09 | 8.58 | 11.27 | 10.80 | 10.94 | 10.51 | - | 10.76 | 12.43 | 11.70 | 11.91 | 9.13 | 9.65 | 0 |
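The distances in the table above appear to be measured from the 90° rotation cluster (hence the "-" in the 90 column). If these are Mahalanobis distances between cluster means in the UMAP latent space, a minimal SciPy sketch might look like the following; the function, cluster locations, and dimensionality are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

def cluster_distance(reference: np.ndarray, other: np.ndarray) -> float:
    """Mahalanobis distance from the mean of `other` to the `reference` cluster."""
    vi = np.linalg.inv(np.cov(reference, rowvar=False))   # inverse covariance of the reference cluster
    return mahalanobis(other.mean(axis=0), reference.mean(axis=0), vi)

# Hypothetical 2D UMAP clusters for the 90-degree and 75-degree rotation sets.
cluster_90 = np.random.randn(200, 2) * 0.5 + [10.5, -5.5]
cluster_75 = np.random.randn(200, 2) * 0.5 + [11.8, -5.7]
print(cluster_distance(cluster_90, cluster_75))
```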
Band | UMAP Dimensions | Train Augmentations | Test Augmentations |
---|---|---|---|
C | 2D | 0, 75 | 0, 90 |
C | 5D | 0, 15 | 0, 90 |
C | 10D | 0, 15 | 0, 90 |
C | All | 0, 15 | 0, 90 |
Dimensions | Training Rotation | Testing Rotation | MCC | 95% CI | Threshold |
---|---|---|---|---|---|
2 | 0, 75 | 0 | 1.0 | N/A | 0.9 |
5 | 0, 15 | 0 | 1.0 | N/A | 0.7 |
10 | 0, 15 | 0 | 1.0 | N/A | 0.6 |
All | 0, 15 | 0 | 1.0 | N/A | 0.6 |
Baseline | 0 | 0 | 1.0 | N/A | 0.5 |
2 | 0, 75 | 90 | 0.93 | 0.069 | 0.7 |
5 | 0, 15 | 90 | 0.53 | 0.17 | 0.5 |
10 | 0, 15 | 90 | 0.48 | 0.16 | 0.5 |
All | 0, 15 | 90 | 0.58 | 0.17 | 0.5 |
Baseline | 0 | 90 | 0.49 | 0.19 | 0.5 |
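The MCC, 95% CI, and threshold columns above summarize each model at its best decision threshold. One common way to obtain such a confidence interval is bootstrap resampling of the test set, sketched below; whether the reported CI is a half-width or another summary, and the authors' exact procedure, are not specified here, so treat this as a generic illustration.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def mcc_bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Half-width of a bootstrap (1 - alpha) confidence interval for MCC."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                     # resample test set with replacement
        scores.append(matthews_corrcoef(y_true[idx], y_pred[idx]))
    lo, hi = np.percentile(scores, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return (hi - lo) / 2.0

# Hypothetical thresholded predictions for one test fold.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1, 1, 0])
print(f"MCC = {matthews_corrcoef(y_true, y_pred):.2f} +/- {mcc_bootstrap_ci(y_true, y_pred):.2f}")
```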
Band | Set | UMAP Dimensions | Train Augmentations | Test Augmentations | Look Angle |
---|---|---|---|---|---|
S | 1 | 2D | 0, 75 | 0, 90 | 90/270 |
S | 2 | 2D | 0, 180 | 0, 90 | 0/180 |
X | 1 | 2D | 0, 75 | 0, 90 | 90/270 |
X | 2 | 2D | 0, 180 | 0, 90 | 0/180 |
Set | Training Rotation | Testing Rotation | MCC | 95% CI | Threshold |
---|---|---|---|---|---|
1 | 0, 75 | 0 | 0.95 | 0.054 | 0.7 |
2 | 0, 180 | 0 | 0.99 | 0.011 | 0.6 |
Baseline | 0 | 0 | 0.97 | 0.056 | 0.4 |
1 | 0, 75 | 90 | 0.66 | 0.092 | 0.5 |
2 | 0, 180 | 90 | 0.010 | 0.021 | 0.6 |
Baseline | 0 | 90 | 0.0057 | 0.012 | 0.5 |
Set | Training Rotation | Testing Rotation | MCC | 95% CI | Threshold |
---|---|---|---|---|---|
1 | 0, 75 | 0 | 1.0 | N/A | 0.8 |
2 | 0, 180 | 0 | 1.0 | N/A | 0.4 |
Baseline | 0 | 0 | 1.0 | N/A | 0.5 |
1 | 0, 75 | 90 | 0.82 | 0.11 | 0.5 |
2 | 0, 180 | 90 | 0.048 | 0.071 | 0.3 |
Baseline | 0 | 90 | 0.093 | 0.13 | 0.3 |
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).