Article

Gridless DOA Estimation Method for Arbitrary Array Geometries Based on Complex-Valued Deep Neural Networks

School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(19), 3752; https://doi.org/10.3390/rs16193752
Submission received: 11 September 2024 / Revised: 6 October 2024 / Accepted: 7 October 2024 / Published: 9 October 2024
(This article belongs to the Special Issue Ocean Remote Sensing Based on Radar, Sonar and Optical Techniques)
Figure 1. (a) Magnitude of Fourier coefficients at various orders. (b) Relationship between array steering vector error and $N$.
Figure 2. Angular-domain covariance matrix reconstruction network architecture.
Figure 3. Training loss, validation loss, and learning rate variation with epochs for the following: (a) $N = 4$. (b) $N = 7$. (c) $N = 10$.
Figure 4. DOA estimation performance of CDNNs with truncation orders $N = 4$, $N = 7$, and $N = 10$ at different target angular separations. (a) RMSE and (b) RP for $\boldsymbol{\theta} = [85^\circ, 95^\circ]$. (c) RMSE and (d) RP for $\boldsymbol{\theta} = [80^\circ, 100^\circ]$. (e) RMSE and (f) RP for $\boldsymbol{\theta} = [70^\circ, 110^\circ]$.
Figure 5. DOA estimation results. The proposed method, source numbers (a) $K = 1$, (b) $K = 2$, and (c) $K = 3$. MUSIC, source numbers (d) $K = 1$, (e) $K = 2$, and (f) $K = 3$. SPICE, source numbers (g) $K = 1$, (h) $K = 2$, and (i) $K = 3$. SBL, source numbers (j) $K = 1$, (k) $K = 2$, and (l) $K = 3$.
Figure 6. Relationship between SNR and both RMSE and RP under spatio-temporal Gaussian white noise conditions. (a) RMSE and (b) RP for $\boldsymbol{\theta} = [80^\circ, 100^\circ]$. (c) RMSE and (d) RP for $\boldsymbol{\theta} = [70^\circ, 110^\circ]$.
Figure 7. Relationship between RP and RMSE with respect to angle separation. (a) RMSE. (b) RP.
Figure 8. Algorithm performance under different snapshot conditions. (a) RMSE. (b) RP.
Figure 9. Schematic of the Swellex-96 Event S59 experiment scenario [38].
Figure 10. BTR results using different methods. (a) GPS. (b) CBF. (c) Proposed method. (d) MUSIC. (e) SPICE. (f) SBL.
Figure 11. Processing results of Swellex-96 Event S59 data using different methods. (a) RP. (b) RMSE. (c) CPU time.

Abstract

Gridless direction of arrival (DOA) estimation methods have garnered significant attention due to their ability to avoid grid mismatch errors, which can adversely affect the performance of high-resolution DOA estimation algorithms. However, most existing gridless methods are primarily restricted to applications involving uniform linear arrays or sparse linear arrays. In this paper, we derive the relationship between the element-domain covariance matrix and the angular-domain covariance matrix for arbitrary array geometries by expanding the steering vector using a Fourier series. Then, a deep neural network is designed to reconstruct the angular-domain covariance matrix from the sample covariance matrix and the gridless DOA estimation can be obtained by Root-MUSIC. Simulation results on arbitrary array geometries demonstrate that the proposed method outperforms existing methods like MUSIC, SPICE, and SBL in terms of resolution probability and DOA estimation accuracy, especially when the angular separation between targets is small. Additionally, the proposed method does not require any hyperparameter tuning, is robust to varying snapshot numbers, and has a lower computational complexity. Finally, real hydrophone data from the SWellEx-96 ocean experiment validates the effectiveness of the proposed method in practical underwater acoustic environments.

1. Introduction

Direction of arrival (DOA) estimation is one of the most fundamental problems in array signal processing, which obtains azimuthal information about targets by sampling the signal in the spatial domain [1,2]. Conventional DOA estimation methods are mainly based on beamforming, subspace decomposition, or sparse reconstruction. Conventional beamforming (CBF) is the simplest beamforming technique and has been widely used in practice due to its very low complexity and high robustness. However, the resolution of CBF can only be enhanced by expanding the array aperture and cannot be improved by increasing the number of snapshots or the signal-to-noise ratio (SNR) [3]. To enhance the performance of CBF when the array aperture is limited, superdirectivity beamforming methods, such as Capon, have been proposed. These methods greatly enhance the resolution of small-aperture arrays, but at the cost of increased sensitivity to errors, which affects robustness [4]. Subspace decomposition is another classical class of DOA estimation methods, of which the MUSIC [5] and ESPRIT [6] algorithms are the most representative. The advantage of such methods is their excellent performance when the number of snapshots is sufficient. However, if the snapshot number is insufficient, the signal covariance matrix cannot be accurately estimated, leading to a significant decline in algorithm performance [7].
Subsequently, sparse reconstruction DOA estimation methods have been proposed based on signal sparsity. These methods are generally divided into three categories based on the signal model: on-grid, off-grid, and gridless approaches [8]. On-grid methods discretize the continuous angular domain into a finite set of points, assuming that the actual DOAs coincide with these discrete points [9,10,11,12]. While these methods enable the direct application of compressed sensing techniques, they are susceptible to grid mismatch errors. Off-grid methods, though they estimate DOAs within the continuous angular domain, still rely on the discretized grid model [13,14,15,16]. To alleviate grid mismatch, these methods employ strategies such as first-order Taylor approximations of the steering vector near the grid [14] or adaptive grid adjustments [15]. However, their performance is contingent on the accuracy of these approximations, and they often introduce non-convex optimization problems, complicating the solution process. In contrast, gridless methods perform DOA estimation directly in the continuous angular domain without grid discretization. These approaches typically rely on techniques like Toeplitz covariance matrix reconstruction or trigonometric polynomial rooting [8,17]. Initially, gridless methods were limited to uniform linear arrays (ULAs) and sparse linear arrays (SLAs), but they have since been extended to coprime arrays through interpolation techniques [18]. Recent developments have further broadened their applicability to arbitrary array geometries by representing the steering vector as a Fourier series [19,20,21,22,23]. Despite these advancements, gridless methods often require semi-definite programming, which can result in high computational complexity. Additionally, in methods that use convex optimization to solve for the spatial spectrum and obtain DOA estimates, hyperparameter tuning presents a significant challenge. 
Improper hyperparameter settings can severely degrade the performance of these algorithms [9].
In recent years, deep learning (DL) has been extensively applied to DOA estimation tasks due to its non-linear capabilities and feature-learning potential [24]. Compared to sparse reconstruction methods, DL-based approaches offer several advantages. First, DOA estimation using a trained deep network bypasses the need for high-dimensional matrix inversion and multiple iterations, leading to a significant reduction in computational complexity. Second, the process does not require hyperparameter tuning, which helps avoid performance degradation due to improper parameter settings. Currently, most DL methods frame the DOA estimation problem as a multi-label classification task, where the deep network outputs the probability of targets being present in different directions. However, this approach is still fundamentally based on the discrete grid model [25,26,27,28]. While increasing the grid resolution can theoretically reduce grid mismatch errors, it also increases the output dimensionality of the network, making the training process more difficult. Although gridless DL-based DOA estimation methods have also been explored, similar to gridless sparse reconstruction approaches, they are typically limited to ULA, SLA [29,30,31], or coprime arrays [32].
In this paper, we first derive the Toeplitz structure of the angular-domain covariance matrix for arbitrary array geometries. Then, we design and train a complex-valued deep neural network (CDNN) to reconstruct this Toeplitz matrix from the sample covariance matrix (SCM) and subsequently employ the Root-MUSIC method to achieve gridless DOA estimation for arbitrary array geometries. The main contributions of this paper are as follows: (1) We represent the steering vector as a Fourier series and derive the Toeplitz structure of the angular-domain covariance matrix for arbitrary array geometries. Furthermore, we establish the relationship between this matrix and the element-domain covariance matrix, providing a theoretical foundation for gridless DOA estimation using SCM in arbitrary array geometries. (2) We propose a novel approach that utilizes CDNN to reconstruct the Toeplitz matrix from SCM and leverages Root-MUSIC for gridless DOA estimation. This method avoids grid mismatch errors, eliminates the need for hyperparameter tuning, and delivers strong performance with low computational complexity. (3) We analyze the impact of the Toeplitz matrix’s dimensionality on DOA estimation performance.
The rest of this paper is organized as follows: Section 2 presents the DOA estimation problem model and derives the Toeplitz structure of the angular-domain covariance matrix from the element-domain covariance matrix. Section 3 describes the training of the CDNN for the subarray of horizontal linear array sorting (HLAS) in the Swellex-96 experiment, to reconstruct the Toeplitz structure of the angular-domain covariance matrix. In Section 4, the performance of the proposed method is assessed through numerical simulations. Section 5 demonstrates the algorithm’s effectiveness using real data obtained from sea trials. Finally, Section 6 provides the conclusions and discussions of this paper.

2. Problem Formulation

In this section, we present the mathematical model for gridless DOA estimation. Specifically, Section 2.1 introduces the general model for the DOA estimation problem in a two-dimensional (2D) plane. Section 2.2 derives the Toeplitz structure of the angular-domain covariance matrix for arbitrary array geometries in the DOA estimation problem.

2.1. Signal Model

In a 2D plane, we consider the scenario where $K$ far-field narrowband plane waves with DOAs $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_K]$ impinge on an array of $M$ omnidirectional sensors. The mutual coupling effects between sensors are not considered. It is assumed that $K < M$ and that the number of sources $K$ is known. The received snapshots of the array can be expressed as
$$\mathbf{y}_l = \mathbf{A}(\boldsymbol{\theta}) \mathbf{x}_l + \mathbf{n}_l, \tag{1}$$
where $\mathbf{A}(\boldsymbol{\theta}) = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2), \ldots, \mathbf{a}(\theta_K)] \in \mathbb{C}^{M \times K}$ is the steering matrix of the array corresponding to the directions $\boldsymbol{\theta}$. $\mathbf{x}_l \in \mathbb{C}^{K}$ and $\mathbf{n}_l \in \mathbb{C}^{M}$ are the signal vector and noise vector at the $l$th snapshot, respectively, and $l = 1, 2, \ldots, L$ is the snapshot index. For an array with arbitrary geometry, the steering vector can be expressed as
$$\mathbf{a}(\theta) = \left[ e^{j 2\pi f r_1 \cos(\theta - \varphi_1)/c}, e^{j 2\pi f r_2 \cos(\theta - \varphi_2)/c}, \ldots, e^{j 2\pi f r_M \cos(\theta - \varphi_M)/c} \right]^T, \tag{2}$$
where the superscript $T$ denotes the transpose operator, $f$ is the center frequency of the narrowband signal, $(r_m, \varphi_m)$ are the polar coordinates of the $m$th array element, and $c$ is the wave propagation speed.
The covariance matrix of the received signal in the element domain can be written as
$$\mathbf{R}_y = \mathbf{A} \mathbf{R}_x \mathbf{A}^H + \mathbf{R}_n, \tag{3}$$
where the superscript $H$ denotes the conjugate transpose operator, and $\mathbf{R}_x$ and $\mathbf{R}_n$ are the covariance matrices of the source signals and noise, respectively. For uncorrelated sources, $\mathbf{R}_x = \mathrm{diag}(\alpha_1^2, \alpha_2^2, \ldots, \alpha_K^2)$, where $\alpha_k^2$ is the power of the $k$th source. For simplicity, we abbreviate $\mathbf{A}(\boldsymbol{\theta})$ as $\mathbf{A}$. In practical scenarios with limited observation data, the SCM $\hat{\mathbf{R}}_y = \frac{1}{L} \sum_{l=1}^{L} \mathbf{y}_l \mathbf{y}_l^H$ is often used to estimate $\mathbf{R}_y$. When the noise is Gaussian, $\hat{\mathbf{R}}_y$ provides an unbiased estimate of $\mathbf{R}_y$.
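The signal model above can be illustrated with a short NumPy sketch that draws snapshots according to (1) and forms the SCM. It is only a minimal illustration under stated assumptions: the sensor coordinates `r` and `phi` are hypothetical placeholders (not the array used in this paper), the sources and noise are taken to be circular complex Gaussian, and all function names are introduced here for illustration.

```python
import numpy as np

def steering_vector(theta, r, phi, f=50.0, c=1490.0):
    """Eq. (2): response of sensors at polar coordinates (r, phi) to a plane wave from theta (rad)."""
    return np.exp(1j * 2 * np.pi * f * r * np.cos(theta - phi) / c)

def complex_gaussian(shape, power, rng):
    """Circular complex Gaussian samples; `power` broadcasts against `shape`."""
    scale = np.sqrt(np.asarray(power, dtype=float) / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def sample_covariance(thetas, powers, r, phi, L, noise_power, rng):
    """Draw L snapshots from Eq. (1) and return the sample covariance matrix (SCM)."""
    A = np.stack([steering_vector(t, r, phi) for t in thetas], axis=1)   # M x K
    X = complex_gaussian((len(thetas), L), np.reshape(powers, (-1, 1)), rng)
    noise = complex_gaussian((r.size, L), noise_power, rng)
    Y = A @ X + noise                                                     # Eq. (1), all snapshots at once
    return Y @ Y.conj().T / L                                             # SCM

# Hypothetical 8-element, roughly linear geometry (metres / radians); NOT the paper's Table 1.
r = np.array([12.0, 9.5, 6.0, 3.5, 0.0, 4.0, 7.5, 11.0])
phi = np.array([0.0, 0.02, -0.03, 0.01, 0.0, np.pi, np.pi + 0.02, np.pi - 0.01])

rng = np.random.default_rng(0)
R_hat = sample_covariance(np.deg2rad([80.0, 100.0]), [1.0, 1.0], r, phi,
                          L=50, noise_power=0.2, rng=rng)
print(R_hat.shape)
```

With many snapshots the SCM approaches $\mathbf{A}\mathbf{R}_x\mathbf{A}^H + \sigma^2\mathbf{I}$, which is the behaviour (3) describes for spatially white noise.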

2.2. Angular-Domain Covariance Matrix

In a 2D plane, the steering vector $\mathbf{a}(\theta)$ is periodic in $\theta$ with period $2\pi$. Therefore, it can be expanded into a Fourier series in terms of $\theta$, given by
$$[\mathbf{a}(\theta)]_m = \sum_{n=-\infty}^{\infty} b_n^m e^{jn\theta} \approx \sum_{n=-N}^{N} b_n^m e^{jn\theta}, \tag{4}$$
where $[\mathbf{a}(\theta)]_m$ represents the $m$th element of $\mathbf{a}(\theta)$, $n$ is the Fourier series index, and $b_n^m$ is the Fourier coefficient. When $b_N^m \approx 0$, the Fourier series can be truncated at $n = N$, and the resulting expression in matrix form is
$$\mathbf{a}(\theta) = \mathbf{B}^H \mathbf{f}(\theta), \tag{5}$$
where $\mathbf{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_M] \in \mathbb{C}^{(2N+1) \times M}$ is the Fourier coefficient matrix, $\mathbf{b}_m = [b_{-N}^m, \ldots, b_0^m, \ldots, b_N^m]^H$ is the coefficient vector, and $\mathbf{f}(\theta) = [e^{-jN\theta}, \ldots, 1, \ldots, e^{jN\theta}]^T$. Therefore, the steering matrix $\mathbf{A}$ can be expressed as
$$\mathbf{A} = \mathbf{B}^H \mathbf{F}, \tag{6}$$
where $\mathbf{F} = [\mathbf{f}(\theta_1), \mathbf{f}(\theta_2), \ldots, \mathbf{f}(\theta_K)]$. In the absence of noise, substituting (6) into (3) yields
$$\tilde{\mathbf{R}}_y = \mathbf{B}^H \mathbf{F} \mathbf{R}_x \mathbf{F}^H \mathbf{B} = \mathbf{B}^H \mathbf{T}(\mathbf{u}) \mathbf{B}, \tag{7}$$
where $\mathbf{T}(\mathbf{u}) = \mathbf{F} \mathbf{R}_x \mathbf{F}^H$ is a Hermitian Toeplitz matrix, and the vector $\mathbf{u} \in \mathbb{C}^{2N+1}$ represents the first row of the matrix $\mathbf{T}(\mathbf{u})$. According to [33], the Fourier coefficients are $b_n^m = j^n J_n(2\pi f r_m / c)\, e^{-jn\varphi_m}$, where $J_n(\cdot)$ denotes the $n$th-order Bessel function of the first kind. We refer to $\mathbf{T}(\mathbf{u})$ as the angular-domain covariance matrix. It is evident that $\mathbf{T}(\mathbf{u})$ has a rank of $K$ and is positive semi-definite. According to the Vandermonde decomposition theorem for semi-definite Toeplitz matrices, $\mathbf{F}$ and $\mathbf{R}_x$ can be uniquely recovered from $\mathbf{T}(\mathbf{u})$, enabling gridless DOA estimation [8,34].
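To make the truncation in (4) and (5) concrete, the following sketch builds the coefficient matrix $\mathbf{B}$ and measures how well $\mathbf{B}^H\mathbf{f}(\theta)$ approximates the exact steering vector. As an assumption made here for self-containment, the Fourier coefficients are computed numerically with an FFT over one period of $\theta$ rather than from the closed-form Bessel expression; the sensor coordinates are hypothetical and the function names are illustrative.

```python
import numpy as np

def steering_matrix(thetas, r, phi, f=50.0, c=1490.0):
    """Rows index angles, columns index sensors; entries follow Eq. (2)."""
    thetas = np.atleast_1d(thetas)
    return np.exp(1j * 2 * np.pi * f * r[None, :]
                  * np.cos(thetas[:, None] - phi[None, :]) / c)

def fourier_coefficient_matrix(r, phi, N, P=256):
    """Numerically estimate b_n^m (n = -N..N) with an FFT and return B of Eq. (5)."""
    grid = 2 * np.pi * np.arange(P) / P                 # one period of theta
    coef = np.fft.fft(steering_matrix(grid, r, phi), axis=0) / P   # coef[n, m] ~ b_n^m
    b = coef[np.arange(-N, N + 1), :]                   # negative indices wrap to n < 0
    return b.conj()                                     # columns b_m = [b_{-N}^m ... b_N^m]^H

def f_vec(theta, N):
    """f(theta) = [e^{-jN theta}, ..., 1, ..., e^{jN theta}]^T."""
    return np.exp(1j * np.arange(-N, N + 1) * theta)

def truncation_error(theta, r, phi, N):
    """Norm of a(theta) - B^H f(theta) for truncation order N."""
    B = fourier_coefficient_matrix(r, phi, N)
    approx = B.conj().T @ f_vec(theta, N)               # Eq. (5)
    exact = steering_matrix(theta, r, phi)[0]
    return np.linalg.norm(exact - approx)

# Hypothetical geometry (metres / radians), used only for illustration.
r = np.array([12.0, 9.5, 6.0, 3.5, 0.0, 4.0, 7.5, 11.0])
phi = np.array([0.0, 0.02, -0.03, 0.01, 0.0, np.pi, np.pi + 0.02, np.pi - 0.01])

errors = {N: truncation_error(1.0, r, phi, N) for N in (4, 7, 10)}
for N, err in errors.items():
    print(N, 20 * np.log10(err))
```

For this assumed geometry the error shrinks quickly as $N$ grows, which matches the decay of the Bessel-function coefficients noted above.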
It is important to note that cross-correlation between signals, as well as between signals and noise, causes the SCM to deviate from the true element-domain covariance matrix [35,36]. This cross-correlation introduces inaccuracies in the signal subspace, resulting in degraded performance of subspace-based algorithms [36]. By recovering a rank-deficient Toeplitz matrix $\mathbf{T}(\mathbf{u})$, the influence of signal correlation on the signal subspace can be mitigated, thereby improving the performance of subspace algorithms. However, directly reconstructing $\mathbf{T}(\mathbf{u})$ from $\hat{\mathbf{R}}_y$ is challenging. Therefore, we propose using a CDNN to learn the mapping between $\hat{\mathbf{R}}_y$ and $\mathbf{T}(\mathbf{u})$, and subsequently employing the Root-MUSIC algorithm [37], which obtains DOA estimates from the roots of a polynomial derived from $\mathbf{T}(\mathbf{u})$. This approach aims to achieve gridless DOA estimation and improve performance for arbitrary array geometries.
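The final step, recovering DOAs from $\mathbf{T}(\mathbf{u})$ by polynomial rooting, can be sketched as follows. This is a generic textbook-style Root-MUSIC applied to an ideal, noise-free $\mathbf{T}(\mathbf{u}) = \mathbf{F}\mathbf{R}_x\mathbf{F}^H$, not necessarily the exact variant used in this paper; the root-selection rule (roots closest to the unit circle, de-duplicated by angle with a tolerance `min_sep`) is an assumption of this sketch.

```python
import numpy as np

def angular_covariance(thetas, powers, N):
    """T(u) = F R_x F^H with f(theta) = [e^{-jN theta}, ..., e^{jN theta}]^T."""
    n = np.arange(-N, N + 1)
    F = np.exp(1j * n[:, None] * np.asarray(thetas)[None, :])   # (2N+1) x K
    return (F * np.asarray(powers)[None, :]) @ F.conj().T

def root_music(T, K, min_sep=1e-3):
    """Estimate K DOAs (rad) from a Hermitian Toeplitz matrix T via Root-MUSIC."""
    D = T.shape[0]
    _, vecs = np.linalg.eigh(T)              # eigenvalues in ascending order
    En = vecs[:, : D - K]                    # noise subspace
    C = En @ En.conj().T
    # f^H C f = sum_k c_k z^k with z = e^{j theta}, c_k = sum of the k-th diagonal of C
    coeffs = [np.trace(C, offset=k) for k in range(D - 1, -D, -1)]
    roots = np.roots(coeffs)
    roots = roots[np.argsort(np.abs(np.abs(roots) - 1.0))]   # closest to unit circle first
    doas = []
    for z in roots:
        ang = np.angle(z)
        # roots come in (near-)duplicate pairs at the same angle; keep one per angle
        if all(abs(np.angle(np.exp(1j * (ang - a)))) > min_sep for a in doas):
            doas.append(ang)
        if len(doas) == K:
            break
    return np.sort(np.array(doas))

true_doas = np.deg2rad([80.0, 100.0])
T = angular_covariance(true_doas, [1.0, 0.5], N=7)
print(np.rad2deg(root_music(T, K=2)))
```

Because the phase of each selected root is the DOA itself, no angular grid is involved at any point.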

3. Angular-Domain Covariance Matrix Reconstruction Network

In this section, we propose a CDNN to model the mapping between the SCM $\hat{\mathbf{R}}_y$ and the angular-domain covariance matrix $\mathbf{T}(\mathbf{u})$ using the subarray composed of elements #1 to #8 of the HLAS in the Swellex-96 experiment [38]. Section 3.1 introduces the CDNN, Section 3.2 provides the array geometry and parameter settings, Section 3.3 describes the dataset used for training the network, and Section 3.4 presents the proposed network framework along with the training process and results.

3.1. Complex-Valued Deep Neural Network

CDNNs [39] can capture the phase information carried jointly by the real and imaginary components of complex numbers, which is often crucial in signal processing tasks. As a result, CDNNs have been widely applied in these tasks [40,41,42], achieving better performance than real-valued deep neural networks (RDNNs). To achieve better performance in covariance matrix reconstruction with fewer parameters, we employ a CDNN. The principles of a CDNN are similar to those of an RDNN. A fully connected neural network consists of an input layer, an output layer, and multiple cascaded hidden layers. Within each hidden layer, the input undergoes a linear transformation followed by a nonlinear activation function. This process can be expressed as
$$\mathbf{o}' = \sigma(\mathbf{z}) = \sigma(\mathbf{W}\mathbf{o} + \mathbf{b}), \tag{8}$$
where $\mathbf{W}$ and $\mathbf{b}$ are the complex weight matrix and bias vector, respectively, $\mathbf{o}$ and $\mathbf{o}'$ are the complex outputs of the previous layer and the current layer, respectively, and $\sigma(\cdot)$ represents the complex nonlinear activation function. The complex matrix multiplication can be expressed in real space as
$$\begin{bmatrix} \Re(\mathbf{z}) \\ \Im(\mathbf{z}) \end{bmatrix} = \begin{bmatrix} \Re(\mathbf{W}) & -\Im(\mathbf{W}) \\ \Im(\mathbf{W}) & \Re(\mathbf{W}) \end{bmatrix} \begin{bmatrix} \Re(\mathbf{o}) \\ \Im(\mathbf{o}) \end{bmatrix} + \begin{bmatrix} \Re(\mathbf{b}) \\ \Im(\mathbf{b}) \end{bmatrix}, \tag{9}$$
where $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts of a complex number, respectively. According to (8) and (9), a complex-valued layer can be expressed as
$$h(\mathbf{o}) = \left( h_r(\mathbf{o}_r) - h_i(\mathbf{o}_i) \right) + j \left( h_r(\mathbf{o}_i) + h_i(\mathbf{o}_r) \right), \tag{10}$$
where $\mathbf{o}_r = \Re(\mathbf{o})$, $\mathbf{o}_i = \Im(\mathbf{o})$, $h_r(\mathbf{x}) = \Re(\mathbf{W})\mathbf{x} + \Re(\mathbf{b})$, and $h_i(\mathbf{x}) = \Im(\mathbf{W})\mathbf{x} + \Im(\mathbf{b})$. Thus, a CDNN is an extension of an RDNN, and a CDNN of the same scale has twice the number of parameters as an RDNN.
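The equivalence between the complex affine map in (8) and its real block form in (9) can be checked numerically. The sketch below is only an illustration with random weights; the function names are introduced here and do not come from any particular deep learning framework.

```python
import numpy as np

def complex_dense(W, b, o):
    """Complex affine map z = W o + b from Eq. (8), before the activation."""
    return W @ o + b

def complex_dense_real(W, b, o):
    """The same map computed purely with real arithmetic, as in Eq. (9)."""
    block = np.block([[W.real, -W.imag],
                      [W.imag,  W.real]])               # (2*out) x (2*in) real matrix
    stacked = block @ np.concatenate([o.real, o.imag]) + np.concatenate([b.real, b.imag])
    n_out = W.shape[0]
    return stacked[:n_out] + 1j * stacked[n_out:]       # re-assemble real and imaginary parts

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
b = rng.standard_normal(3) + 1j * rng.standard_normal(3)
o = rng.standard_normal(4) + 1j * rng.standard_normal(4)

print(np.allclose(complex_dense(W, b, o), complex_dense_real(W, b, o)))
```

The real block matrix holds two real arrays, $\Re(\mathbf{W})$ and $\Im(\mathbf{W})$, which is where the factor of two in the parameter count comes from.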

3.2. Array Geometry and Parameter Settings

We use the subarray composed of elements #1 to #8 from HLAS in Swellex-96 to generate a dataset for training the CDNN for angular-domain covariance matrix reconstruction. With element #5 as the origin of the Cartesian coordinate system, the coordinates of the 8-element array are shown in Table 1. The array is approximated as a linear array with unequal element spacing, with element #1 aligned in the direction of azimuth $\theta = 0^\circ$. The DOA is defined as the angle between the end-fire direction of element #1 and the direction of the incoming source, with the counterclockwise direction considered positive, ranging from $0^\circ$ to $180^\circ$. The center frequency of the narrowband far-field plane wave is set to $f = 50$ Hz, and the speed of sound in water is assumed to be $c = 1490$ m/s. The maximum element spacing in the linear array is $d_{\max} = 0.21\lambda$, and the minimum element spacing is $d_{\min} = 0.12\lambda$.
According to [19,21], when the wavelength of the narrowband signal is fixed, the error between the steering vector approximated by (5) and the true steering vector is related to the distance of the array elements from the coordinate origin, as well as the truncation order $N$. Figure 1a shows the magnitude of the Fourier coefficients for different element distances when $f = 50$ Hz, $\theta = 0^\circ$, and $\varphi_m = 0^\circ$, with the Fourier coefficients expressed in decibels, i.e., $20 \lg |b_n|$. Figure 1b shows the relationship between the steering vector error and truncation order for different azimuth angles in the array specified in Table 1, with the steering vector error also expressed in decibels, i.e., $20 \lg \| \mathbf{a}(\theta) - \mathbf{B}^H \mathbf{f}(\theta) \|$. We find that the magnitude of the Fourier coefficients decreases with increasing order, so higher truncation orders reduce the approximation error in (4) and (5), making $\tilde{\mathbf{R}}_y$ closer to the true element-domain covariance matrix. At the same time, the dimensionality of $\mathbf{T}(\mathbf{u})$ increases with higher truncation orders, theoretically enhancing the performance of DOA estimation. However, this increase in dimensionality also raises the complexity of the network by expanding the number of output parameters, making the network more challenging to train. Therefore, we set the truncation orders to $N = 4$, $N = 7$, and $N = 10$ in the following experiments and analyze the effect of truncation order on the performance of the CDNN in reconstructing $\mathbf{T}(\mathbf{u})$ through simulation results.

3.3. Dataset

We use MATLAB to generate the dataset for training the CDNN according to the array geometry and parameters provided in Section 3.2. The number of incoherent sources in the dataset is $K \in \{2, 3\}$, with data corresponding to $K = 2$ and $K = 3$ each comprising half of the dataset. In the $i$th data sample, the DOA $\theta_{ik}$ of the $k$th incoherent source is randomly generated within the range $[0, \pi]$, following an independent uniform distribution. The source amplitudes are $\alpha_{i1} = 1$, while $\alpha_{ik}$, $k \neq 1$, follow independent uniform distributions over $[0, 1]$. To enhance the robustness of the CDNN in real noise environments, we introduce spatially and temporally uncorrelated Gaussian white noise into the data, with the SNR uniformly distributed over $[0, 20]$ dB. For multiple sources, the SNR is defined as $\sum_{k=1}^{K} \alpha_k^2 / \sigma^2$, where $\sigma^2$ represents the noise power. Substituting the above parameters into (1), we obtain the array snapshot signal $\mathbf{Y}_i = [\mathbf{y}_{i1}, \mathbf{y}_{i2}, \ldots, \mathbf{y}_{iL}]$, where $\mathbf{y}_{il} \sim \mathcal{CN}\left(\mathbf{0}, \mathbf{A}_i \mathbf{R}_{x_i} \mathbf{A}_i^H + \frac{\sum_{k=1}^{K} \alpha_{ik}^2}{\mathrm{SNR}_i} \mathbf{I}\right)$, $\mathbf{R}_{x_i} = \mathrm{diag}(\alpha_{i1}^2, \ldots, \alpha_{iK}^2)$, $\mathbf{A}_i = [\mathbf{a}(\theta_{i1}), \ldots, \mathbf{a}(\theta_{iK})]$, and $\mathbf{I}$ is the identity matrix. To enhance the randomness of the dataset, we set the number of snapshots for the $i$th data sample as $L_i = \mathrm{mod}(i, 10) \times 10$, where $\mathrm{mod}(x, y)$ represents the remainder when dividing $x$ by $y$.
We model the mapping from $\hat{\mathbf{R}}_y$ to $\mathbf{T}(\mathbf{u})$ as a regression task, with $\hat{\mathbf{R}}_y$ as the input and $\mathbf{T}(\mathbf{u})$ as the target label. The input data $\hat{\mathbf{R}}_y$ is the SCM of the array's snapshot signal $\mathbf{Y}$, which is a complex-valued matrix of dimension $M \times M$. For the $i$th input data $\hat{\mathbf{R}}_{y_i} = \mathbf{Y}_i \mathbf{Y}_i^H / L_i$, the corresponding label is the ideal angular-domain covariance matrix $\mathbf{T}(\mathbf{u})_i = \mathbf{F}_i \mathbf{R}_{x_i} \mathbf{F}_i^H$ with $\mathbf{F}_i = [\mathbf{f}(\theta_{i1}), \mathbf{f}(\theta_{i2}), \ldots, \mathbf{f}(\theta_{iK})]$, consistent with (7); this label is a complex-valued matrix of dimension $(2N+1) \times (2N+1)$. Therefore, the $i$th data pair in the dataset is represented as $(\hat{\mathbf{R}}_{y_i}, \mathbf{T}(\mathbf{u})_i)$, forming a dataset $\varepsilon_{tra} = \{(\hat{\mathbf{R}}_{y_1}, \mathbf{T}(\mathbf{u})_1), (\hat{\mathbf{R}}_{y_2}, \mathbf{T}(\mathbf{u})_2), \ldots, (\hat{\mathbf{R}}_{y_{E_{tra}}}, \mathbf{T}(\mathbf{u})_{E_{tra}})\}$ of size $E_{tra} = 120{,}000$ to be used as the training set. Similarly, we generate a validation set $\varepsilon_{val}$ of size $E_{val} = 10{,}000$ for model validation during training. Additionally, we construct a test set $\varepsilon_{test}$ of size $E_{test} = 10{,}000$ to evaluate the performance of the trained CDNN.
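Two pieces of the dataset construction lend themselves to a short sketch: converting a drawn SNR into a noise power, and building the label. Since $\mathbf{T}(\mathbf{u})$ is Hermitian Toeplitz, it is fully determined by its first row $\mathbf{u}$, whose entries are $u_q = \sum_k \alpha_k^2 e^{-jq\theta_k}$ for $\mathbf{f}(\theta)$ as defined in Section 2.2. The code below is a NumPy illustration with hypothetical function names, not the MATLAB code used to generate the dataset.

```python
import numpy as np

def noise_power(alphas, snr_db):
    """sigma^2 = (sum_k alpha_k^2) / SNR, with the SNR given in dB."""
    return float(np.sum(np.square(alphas)) / 10.0 ** (snr_db / 10.0))

def label_first_row(thetas, alphas, N):
    """First row u of T(u) = F R_x F^H: u_q = sum_k alpha_k^2 exp(-j q theta_k), q = 0..2N."""
    q = np.arange(2 * N + 1)
    phases = np.exp(-1j * q[:, None] * np.asarray(thetas)[None, :])
    return (np.square(alphas)[None, :] * phases).sum(axis=1)

def toeplitz_from_first_row(u):
    """Rebuild the Hermitian Toeplitz matrix T(u) from its first row."""
    idx = np.arange(u.size)
    lag = idx[None, :] - idx[:, None]          # column index minus row index
    upper = u[np.abs(lag)]
    return np.where(lag >= 0, upper, upper.conj())

thetas = np.deg2rad([75.0, 110.0])
alphas = np.array([1.0, 0.5])
u = label_first_row(thetas, alphas, N=4)
T_label = toeplitz_from_first_row(u)
print(T_label.shape, noise_power(alphas, snr_db=10.0))
```

Working with $\mathbf{u}$ rather than the full matrix reduces the number of complex values that describe the label from $(2N+1)^2$ to $2N+1$.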

3.4. Network Architecture and Training

As shown in Figure 2, the proposed angular-domain covariance matrix reconstruction CDNN consists of input, output, and dense layers. Since the input SCM is a Hermitian matrix, its upper triangular elements cover all the matrix information. Therefore, the input layer takes a vector $\hat{\mathbf{r}}_y \in \mathbb{C}^{M(M+1)/2}$ composed of the upper triangular part of the SCM. The output layer produces a vector $\mathbf{u} \in \mathbb{C}^{2N+1}$, composed of the first row elements of $\mathbf{T}(\mathbf{u})$. After post-processing $f_p(\mathbf{u}) = \mathbf{T}(\mathbf{u})$, the reconstructed angular-domain covariance matrix is obtained. The CDNN can be represented as
$$\hat{\mathbf{T}}(\mathbf{u}) = f_{\mathrm{CDNN}}(\hat{\mathbf{R}}_y) = f_p\left(f_7\left(f_6\left(\cdots f_1(\hat{\mathbf{r}}_y)\right)\right)\right), \tag{11}$$
where each $f_i(\cdot)$, $i = 1, 2, \ldots, 6$, is followed by Complex ReLU (CReLU) as the activation function. In $f_i(\cdot)$, $i = 1, 2, 3, 4$, Complex LayerNorm (CLayerNorm) is inserted between the fully connected layers and CReLU for normalization. A Dropout layer with a dropout rate of 0.3 is applied after $f_2(\cdot)$ to mitigate overfitting. The CReLU and CLayerNorm functions can be expressed as
$$\mathrm{CReLU}(\mathbf{x}) = \mathrm{ReLU}(\Re(\mathbf{x})) + j\, \mathrm{ReLU}(\Im(\mathbf{x})), \tag{12}$$
$$\mathrm{CNorm}(\mathbf{x}) = \frac{\mathbf{x} - \mu}{\sqrt{\sigma^2 + \varepsilon}}, \tag{13}$$
where $\mathbf{x} = [x_1, x_2, \ldots, x_D]^T$ is the output of a certain layer in the network, $\mu = \frac{1}{D} \sum_{d=1}^{D} x_d$, $\sigma^2 = \frac{1}{D} \sum_{d=1}^{D} |x_d - \mu|^2$, and $\varepsilon$ is a small value to prevent division by zero in (13).
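The two element-wise building blocks in (12) and (13) are simple enough to sketch directly. The version below operates on a single NumPy vector, omits any learnable scale and shift parameters, and uses `eps=1e-5` as an assumed default; none of these choices are stated in the text.

```python
import numpy as np

def crelu(x):
    """Eq. (12): apply ReLU separately to the real and imaginary parts."""
    return np.maximum(x.real, 0.0) + 1j * np.maximum(x.imag, 0.0)

def clayernorm(x, eps=1e-5):
    """Eq. (13): normalise a complex vector by its complex mean and real variance."""
    mu = x.mean()
    var = np.mean(np.abs(x - mu) ** 2)
    return (x - mu) / np.sqrt(var + eps)

x = np.array([1 + 1j, 2 - 1j, -1 + 3j, 0.5j])
y = clayernorm(x)
print(crelu(np.array([1 - 2j, -3 + 4j])))
print(abs(y.mean()), np.mean(np.abs(y) ** 2))
```

After normalisation the vector has (numerically) zero complex mean and an average squared magnitude close to one, up to the small bias introduced by `eps`.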
Since $\mathbf{T}(\mathbf{u})$ is expressed in terms of power, we chose the mean absolute error between $f_{\mathrm{CDNN}}(\hat{\mathbf{R}}_{y_i})$ and $\mathbf{T}(\mathbf{u})_i$ as the loss function during network training to ensure that the network's output closely approximates the true angular-domain covariance matrix. We use the Adam optimizer [43] to optimize the network parameters, with an initial learning rate of 0.01 and parameters $\beta_1 = 0.9$, $\beta_2 = 0.999$. Additionally, a 20-epoch warmup is introduced to stabilize network convergence. The training process adopts mini-batch gradient descent with a batch size of 1024. At the beginning of each epoch, the training data are shuffled to ensure that the distribution of each mini-batch is random. Then, the data are divided into multiple mini-batches, with each mini-batch undergoing forward propagation, loss computation, backpropagation, and parameter updates. To improve training efficiency and prevent overfitting, we introduce ReduceLROnPlateau and Early Stopping techniques. If the loss on the validation set does not decrease within 10 epochs, ReduceLROnPlateau reduces the learning rate by half, allowing the optimizer to search the parameter space more finely. If the loss on the validation set does not improve for 20 consecutive epochs, Early Stopping terminates training to prevent overfitting on the training set.
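The interaction between the learning-rate reduction and the early-stopping rule can be sketched as plain control logic. The function below is a framework-agnostic illustration, not a reproduction of any library's scheduler: the exact counting convention (for example, whether the plateau counter restarts after each reduction, as assumed here) is not specified in the text, and the warmup phase is left out.

```python
def run_schedule(val_losses, lr0=0.01, factor=0.5, lr_patience=10, stop_patience=20):
    """Replay a sequence of validation losses and return (lr per epoch, stop epoch or None)."""
    lr = lr0
    best = float("inf")
    since_best = 0        # epochs since the best validation loss (drives early stopping)
    since_lr_event = 0    # epochs since the best loss or the last learning-rate cut
    history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best, since_lr_event = loss, 0, 0
        else:
            since_best += 1
            since_lr_event += 1
            if since_lr_event >= lr_patience:
                lr *= factor              # halve the learning rate on a plateau
                since_lr_event = 0
        history.append(lr)
        if since_best >= stop_patience:
            return history, epoch         # early stopping triggers here
    return history, None

# A loss curve that improves for five epochs and then stalls.
losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.45] * 30
lrs, stop_epoch = run_schedule(losses)
print(stop_epoch, lrs[-1])
```

Under these assumptions a stalled run sees two learning-rate halvings before training stops, so the lower learning rate gets a chance to help before the run is abandoned.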
Figure 3 illustrates the changes in training loss, validation loss, and learning rate over epochs for different values of N = 4 , N = 7 and N = 10 . Table 2 summarizes the training results of the three CDNN models, including the final losses on the training and validation sets, as well as the losses on the test set after training. Based on the results shown in Table 2 and Figure 3, it can be observed that all three CDNN models converge well and do not exhibit significant overfitting. Additionally, since we only varied the output dimension across these three CDNN models, we find that while all models achieve good training performance, the CDNN with a higher output dimension tends to exhibit larger average errors on the test set.

4. Simulation

In this section, we will analyze the performance of our proposed method through numerical simulations and compare it with existing DOA estimation methods such as MUSIC [5], SPICE [12], and SBL [44]. First, we compare the gridless DOA estimation performance of CDNN models with truncation orders N = 4 , N = 7 and N = 10 under different SNR conditions. Then, we present the DOA estimation results of these methods for varying numbers of sources within the 0 to 180 range. Finally, we evaluate the impact of the SNR, target angular separation, and the number of snapshots on the performance of the aforementioned algorithms in uncorrelated source scenarios through Monte Carlo simulations.
We consider $K$ uncorrelated narrowband acoustic sources with equal power at a center frequency of $f = 50$ Hz, whose DOAs are $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_K]$, and assume that the number of sources is known. In the Monte Carlo simulations, the number of sources is set to $K = 2$. The performance of DOA estimation is evaluated using the resolution probability (RP) and the root mean square error (RMSE) [9,45,46]. The Cramer–Rao Bound (CRB) provides the lower bound of the RMSE for unbiased estimators, serving as a reference for the DOA estimation algorithms studied in this paper [47]. For grid-based methods, DOA estimates are obtained from the peaks of the spatial spectrum. Specifically, the spatial spectrum obtained from the DOA estimation algorithm is first normalized, and then peaks with power greater than −10 dB are identified, where each peak corresponds to a target. If the algorithm identifies two targets and the DOA estimates $\hat{\boldsymbol{\theta}} = [\hat{\theta}_1, \hat{\theta}_2]$ satisfy $|\hat{\theta}_1 - \theta_1| + |\hat{\theta}_2 - \theta_2| < |\theta_1 - \theta_2|$, the algorithm is considered to have successfully resolved the targets. To ensure the accuracy of the simulations, we only report the RMSE of an algorithm for cases where $\mathrm{RP} \geq 0.5$. The RMSE of DOA estimation is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{2T}\left[\sum_{t=1}^{T}\left(\hat{\theta}_{1}^{t} - \theta_1\right)^2 + \sum_{t=1}^{T}\left(\hat{\theta}_{2}^{t} - \theta_2\right)^2\right]},$$

where $T$ is the number of successful resolutions, and $\hat{\boldsymbol{\theta}}^{t} = [\hat{\theta}_{1}^{t}, \hat{\theta}_{2}^{t}]$ is the estimate of $\boldsymbol{\theta}$ in the $t$th successful simulation.

The number of Monte Carlo simulations is set to 1000. For SPICE and SBL, the number of iterations is set to 10 and 1000, respectively. For the MUSIC algorithm, we divide the azimuth angle grid at 0.1° intervals, while for SPICE and SBL, the grid is divided at 1° intervals.
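The two metrics defined above can be computed from a list of Monte Carlo trials as in the following minimal sketch. The function names `is_resolved` and `rp_and_rmse` are hypothetical, and pairing estimates with true DOAs by sorting both in ascending order is an assumption that the text does not state explicitly.

```python
import math
from typing import List, Sequence, Tuple


def is_resolved(est: Sequence[float], true: Sequence[float]) -> bool:
    """Two-source resolution test (angles in degrees).

    Requires exactly two estimates with
    |est1 - th1| + |est2 - th2| < |th1 - th2|.
    """
    if len(est) != 2:
        return False
    e1, e2 = sorted(est)    # assumed pairing: ascending order
    t1, t2 = sorted(true)
    return abs(e1 - t1) + abs(e2 - t2) < abs(t1 - t2)


def rp_and_rmse(
    trials: List[Sequence[float]], true: Sequence[float]
) -> Tuple[float, float]:
    """Return (RP, RMSE); RMSE is NaN when RP is below 0.5, as in the text."""
    ok = [sorted(e) for e in trials if is_resolved(e, true)]
    rp = len(ok) / len(trials)
    if rp < 0.5:
        return rp, math.nan
    t1, t2 = sorted(true)
    # Sum of squared errors over successful trials only.
    sq = sum((e1 - t1) ** 2 + (e2 - t2) ** 2 for e1, e2 in ok)
    return rp, math.sqrt(sq / (2 * len(ok)))
```

Note that the RMSE is averaged over the successful trials only, so it should always be read together with the RP.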

4.1. Impact of N on DOA Estimation Performance

Figure 4 shows how the RMSE and RP of gridless DOA estimation vary with the SNR for CDNNs with truncation orders $N = 4$, $N = 7$, and $N = 10$ in three scenarios: $\boldsymbol{\theta} = [85^\circ, 95^\circ]$, $\boldsymbol{\theta} = [80^\circ, 100^\circ]$, and $\boldsymbol{\theta} = [70^\circ, 110^\circ]$. In the simulation, the noise is modeled as a zero-mean Gaussian white random process, assumed to be spatially and temporally uncorrelated. The SNR is defined as $(\alpha_1^2 + \alpha_2^2)/\sigma^2$, where $\alpha_1^2$, $\alpha_2^2$, and $\sigma^2$ represent the power of the two sources and of the noise at the array elements, respectively. The number of snapshots is set to 50. As shown in Figure 4d,f, even when the angular separation between the two targets is large and the SNR is high, the CDNN with $N = 10$ still occasionally fails to distinguish the two targets. Additionally, as depicted in Figure 4b, when the angular separation is small, the RP for $N = 10$ deteriorates more significantly as the SNR decreases. Conversely, as seen in Figure 4c,e, the CDNN with $N = 4$ exhibits higher robustness to Gaussian white noise when the target angular separations are 20° and 40°, providing better performance at low SNR levels while maintaining performance similar to $N = 7$ and $N = 10$ at higher SNRs. However, as shown in Figure 4a, when the target angular separation decreases to 10° and the SNR exceeds 5 dB, the DOA estimation error for the CDNN with $N = 4$ becomes the largest. In summary, while $N = 10$ provides better resolution for closely spaced targets, its DOA estimation performance becomes unstable when the target angular separation is large and the SNR is high. This instability arises because the network must fit outputs of a higher dimensionality, which can lead to higher fitting errors. On the other hand, $N = 4$ delivers superior DOA estimation performance under conditions of larger angular separation and lower SNR, but its resolution is limited when the angular separation is small.
To balance the DOA estimation performance across varying angular separations, we choose the CDNN with $N = 7$, for which the corresponding angular-domain covariance matrix $\mathbf{T}(\mathbf{u})$ is a $(2N+1) \times (2N+1) = 15 \times 15$ Toeplitz matrix.
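The SNR definition used in this subsection fixes the noise variance once the source powers are chosen. The sketch below shows one way to set up the noise, under two assumptions that the text does not spell out: the SNR values quoted in dB correspond to $10\log_{10}$ of the power ratio, and the complex noise is circularly symmetric (its variance split equally between real and imaginary parts). The function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def noise_variance(source_powers: Sequence[float], snr_db: float) -> float:
    """Noise power sigma^2 such that sum(source_powers) / sigma^2 equals the SNR.

    Assumes snr_db = 10 * log10(power ratio).
    """
    return sum(source_powers) / (10.0 ** (snr_db / 10.0))


def complex_white_noise(
    n_elements: int, n_snapshots: int, sigma2: float, rng: random.Random
) -> List[List[complex]]:
    """Zero-mean, spatially and temporally uncorrelated complex Gaussian noise.

    Each sample has total power sigma2 (sigma2 / 2 per real and imaginary part).
    """
    s = math.sqrt(sigma2 / 2.0)
    return [
        [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n_snapshots)]
        for _ in range(n_elements)
    ]
```

For example, two unit-power sources at an SNR of 10 dB give `noise_variance([1.0, 1.0], 10.0)` equal to 0.2 under these assumptions.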

4.2. DOA Estimation Results

In this section, we present the DOA estimation results of the methods under study for scenarios with $K \in \{1, 2, 3\}$ sources within an angular range of 0° to 180°. When the number of sources is 2 or 3, we set the angular separation between adjacent sources to $\Delta\theta = 40^\circ$, so that the source DOAs are $\boldsymbol{\theta} = [\theta_1, \theta_1 + \Delta\theta, \ldots, \theta_1 + (K-1)\Delta\theta]$. We consider all $\boldsymbol{\theta}$ with $\theta_1 \in \{0^\circ : 0.1^\circ : 180^\circ - (K-1)\Delta\theta\}$, with the SNR set to 10 dB and the noise being spatiotemporally uncorrelated Gaussian white noise. The number of snapshots is set to 50. The DOA estimation results for the various methods are shown in Figure 5, and the RMSE for different angular ranges is provided in Table 3. When the number of sources is $K = 1$, the results in Table 3 indicate that the proposed method and MUSIC achieve more accurate DOA estimates than SPICE and SBL. Comparing Figure 5a,d,g,j, it can be seen that even with a dense grid spacing of 1°, grid mismatch errors still severely affect the performance of SPICE and SBL. When $K = 2$ or $K = 3$ and all sources lie within the angular range $[45^\circ, 135^\circ]$, all four methods obtain relatively accurate DOA estimates. However, when the source DOAs span the full range $[0^\circ, 180^\circ]$, the proposed method shows a significant RMSE advantage over the other three methods. According to the simulation results shown in Figure 5, when all sources are far from the end-fire direction, all four methods can effectively distinguish the targets and achieve accurate DOA estimation. However, when a source is close to the end-fire direction, the DOA estimation performance of the other three methods deteriorates significantly, making it difficult to distinguish adjacent targets or leading to large DOA estimation errors. In contrast, the performance degradation of the proposed method is relatively mild.
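The set of test DOAs described above can be enumerated as in the following minimal sketch. The function name `doa_test_sets` is hypothetical, and rounding to six decimals is only a guard against floating-point drift, not part of the method.

```python
from typing import List


def doa_test_sets(
    k: int, delta_deg: float = 40.0, step_deg: float = 0.1
) -> List[List[float]]:
    """All DOA sets [th1, th1 + delta, ..., th1 + (k-1)*delta] in degrees,

    with th1 swept from 0 to 180 - (k-1)*delta in steps of step_deg.
    """
    n_steps = round((180.0 - (k - 1) * delta_deg) / step_deg)
    sets = []
    for i in range(n_steps + 1):
        theta1 = i * step_deg
        sets.append([round(theta1 + j * delta_deg, 6) for j in range(k)])
    return sets
```

Under these settings the sweep contains 1801, 1401, and 1001 DOA sets for $K = 1$, $2$, and $3$, respectively, and the last set for $K = 3$ is [100°, 140°, 180°], so the final source reaches the end-fire direction.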

4.3. Relationship between Algorithm Performance and SNR

We analyze the relationship between algorithm performance and the SNR under two scenarios: $\boldsymbol{\theta} = [80^\circ, 100^\circ]$ and $\boldsymbol{\theta} = [70^\circ, 110^\circ]$. The other definitions and settings in the simulation are the same as those in Section 4.1, and the simulation results are presented in Figure 6. When $\boldsymbol{\theta} = [80^\circ, 100^\circ]$, as shown in Figure 6a, the proposed method achieves a lower RMSE than the other three methods when the SNR is below 14 dB. Additionally, as illustrated in Figure 6b, the proposed method maintains an RP above 0.6 for distinguishing the two sources across the SNR range of −10 to 20 dB. In contrast, the RP of the other three methods decreases rapidly with decreasing SNR, falling below 0.2 at an SNR of 0 dB, indicating that they almost fail to distinguish the two sources. For the scenario with $\boldsymbol{\theta} = [70^\circ, 110^\circ]$, Figure 6c,d show that in the SNR range of −2 to 12 dB, the proposed method achieves an RMSE comparable to or lower than that of the other methods, while all methods maintain RP values close to 1, successfully distinguishing the two targets. When the SNR drops below −4 dB, although the RMSE of the proposed method exceeds that of SBL and MUSIC, it still achieves the highest RP, which remains above 0.7 even when the SNR decreases to −10 dB.

4.4. Relationship between Algorithm Performance and Angle Separation

We set $\theta_1 = 90^\circ - \Delta\theta/2$ and $\theta_2 = 90^\circ + \Delta\theta/2$, where $\Delta\theta$ represents the angular separation between the two sources. The SNR in the simulation is set to 10 dB, and other conditions remain the same as in Section 4.1. The relationship between the angular separation and the DOA estimation performance, as measured by RP and RMSE, is shown in Figure 7. When the angular separation between the two sources is between 20° and 50°, the proposed method shows RMSE values similar to those of the other three methods, with the RMSE difference among the four methods being less than 0.2°. Additionally, all four methods are able to distinguish between the two sources with an RP close to 1. However, when the angular separation between the two sources is less than 20°, the proposed method exhibits the lowest DOA estimation RMSE and can still differentiate between the two sources with an RP close to 1. In contrast, the RP of the other three methods decreases rapidly with decreasing angular separation. When the angular separation is reduced to 10°, the RP of all three existing methods drops below 0.2, making them almost unable to distinguish between the sources. The simulation results demonstrate that the proposed method outperforms the existing methods in DOA estimation when the angular separation is small.

4.5. Impact of Snapshot Number on Algorithm Performance

In this section, we analyze the impact of the number of snapshots on algorithm performance. Figure 8a,b illustrate the RMSE and RP of the proposed method and the three existing methods for DOA estimation under different snapshot numbers. In the simulation, we set $\boldsymbol{\theta} = [78^\circ, 102^\circ]$ and the SNR to 10 dB, with other conditions being the same as in Section 4.1. As shown in Figure 8, when the number of snapshots exceeds 50, the proposed method exhibits the highest RMSE, while all four methods are able to distinguish the two sources with an RP close to 1. When the number of snapshots is less than 30, the RMSE of MUSIC and SPICE increases, and their RP decreases rapidly as the number of snapshots decreases. When the number of snapshots falls below 10, the RP of MUSIC and SPICE drops below 0.5, indicating a high probability of failure in accurate DOA estimation. At this point, the proposed method can still distinguish the two sources with an RP close to 1, and it also achieves a lower RMSE than the three existing methods. Based on the simulation results, we conclude that the proposed method outperforms the other three existing methods when the number of snapshots is low, demonstrating higher adaptability to fewer snapshots.

5. SWellEx-96 Event S59 Experimental Results

In this section, we validate the effectiveness of the proposed method using experimental data from the SWellEx-96 Event S59. The SWellEx-96 experiment [38] was conducted in shallow waters approximately 12 km off the tip of Point Loma near San Diego, California, from 10 to 18 May 1996. Figure 9 depicts the event scenario, with the source's trajectory in blue and the interferer's trajectory in red. The markers on the tracks represent GPS-recorded coordinates at 5-minute intervals, and the entire experiment lasted for 65 min. During the experiment, the source continuously emitted signals at various frequencies, with the higher-power tones at 49, 64, 79, 94, 112, 130, 148, 166, 201, 235, 283, 338, and 388 Hz. The HLA subarray used in our study had a sampling frequency of $f_s = 3276.8$ Hz, and the subarray formed by elements #1 to #8 effectively met the far-field condition during the experiment.
The data received by the 8-element subarray are denoted as $\mathbf{Y} = [\mathbf{Y}(:,1), \mathbf{Y}(:,2), \ldots, \mathbf{Y}(:,8)]$, where the column vector $\mathbf{Y}(:,m)$ represents the data received by the $m$th array element. We divided the data into $Q = 361$ segments, each starting at time $t_{\mathrm{start}} \in \{t_1, t_2, \ldots, t_Q\} = \{0\ \mathrm{s} : 10\ \mathrm{s} : 3600\ \mathrm{s}\}$ and lasting for a duration $T_{\mathrm{seg}} = 30$ s. The data received by the $m$th array element during the $q$th segment can be expressed as

$$\bar{\mathbf{y}}_{m}^{q} = \mathbf{Y}\left(\lfloor f_s t_q \rfloor + 1 : \lfloor f_s (t_q + T_{\mathrm{seg}}) \rfloor,\ m\right),$$

where $\lfloor \cdot \rfloor$ denotes the floor function. Each segment was further divided into $L = 100$ snapshots, with each snapshot containing $N_{\mathrm{snap}} = 2048$ sampling points. The duration of each snapshot was $T_{\mathrm{snap}} = 0.625$ s, and the overlap between consecutive snapshots was $\eta = 53\%$. Therefore, the $l$th snapshot within $\bar{\mathbf{y}}_{m}^{q}$ is represented as

$$\tilde{\mathbf{y}}_{m}^{q,l} = \bar{\mathbf{y}}_{m}^{q}\left(\lfloor (l-1)(1-\eta) f_s T_{\mathrm{snap}} \rfloor + 1 : \lfloor (l-1)(1-\eta) f_s T_{\mathrm{snap}} \rfloor + N_{\mathrm{snap}}\right).$$

We apply the different DOA estimation methods to this snapshot model to perform narrowband DOA estimation at a frequency of $f_c = 50$ Hz for each data segment, thereby obtaining the DOA trajectories.
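The segment and snapshot indexing described above can be checked with a short sketch. The following is an illustration under stated assumptions, not the processing code used for the experiment: the function names are hypothetical, indices are converted to Python's 0-based, end-exclusive convention, and the sampling rate and overlap are stored as integers (tenths of a hertz and percent) purely so that the floor operations are exact.

```python
from typing import Tuple

FS_X10 = 32768      # 3276.8 Hz expressed in tenths of a hertz (exact integer math)
T_SEG_S = 30        # segment duration in seconds
HOP_S = 10          # spacing between segment start times in seconds
N_SNAP = 2048       # samples per snapshot (f_s * T_snap = 3276.8 * 0.625)
L_SNAP = 100        # snapshots per segment
ETA_PERCENT = 53    # overlap between consecutive snapshots, in percent


def segment_bounds(q: int) -> Tuple[int, int]:
    """0-based, end-exclusive sample range of the q-th segment (q = 1..361)."""
    t_q = (q - 1) * HOP_S
    start = FS_X10 * t_q // 10                # floor(f_s * t_q)
    stop = FS_X10 * (t_q + T_SEG_S) // 10     # floor(f_s * (t_q + T_seg))
    return start, stop


def snapshot_bounds(l: int) -> Tuple[int, int]:
    """0-based, end-exclusive range of the l-th snapshot (l = 1..100) in a segment."""
    # floor((l - 1) * (1 - eta) * N_snap), computed with integers only.
    start = (l - 1) * (100 - ETA_PERCENT) * N_SNAP // 100
    return start, start + N_SNAP
```

With these values each segment holds 98,304 samples, consecutive snapshots start 962 or 963 samples apart, and the last snapshot ends at sample 97,341, so all 100 snapshots fit inside one 30 s segment.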
Figure 10 presents the bearing time records (BTRs) obtained using different DOA estimation methods based on the SWellEx-96 Event S59 data, along with the GPS-recorded tracks of the source and interferer. As shown in Figure 10b, CBF cannot distinguish the two targets prior to 1800 s because of the small angular separation between the source and the interferer. After 2400 s, although the angular separation increases and two peaks appear in the CBF spectrum, high sidelobes severely affect the performance of DOA estimation. As shown in Figure 10d–f, the three existing methods demonstrate better resolution than CBF, allowing the source and interferer tracks to be distinguished. However, all three methods exhibit high levels of pseudospectra around the DOA track, which negatively impact target trajectory identification. Moreover, for SPICE and SBL, the interferer's DOA is close to the end-fire direction of the array, making it difficult to identify the interferer prior to 1800 s in most cases. As shown in Figure 10c, the proposed method directly achieves gridless DOA estimation, avoiding the influence of pseudospectra on BTR trajectories. Throughout the entire BTR, the proposed method consistently identifies both the source and the interferer with a high probability, making it more effective for trajectory determination.

Figure 11 illustrates the results of processing the SWellEx-96 Event S59 data using different methods. The RMSE was calculated by interpolating the source trajectory with its GPS angles, and the data were processed using an AMD Ryzen 7 6800H CPU. The results show that the proposed method significantly outperforms the other three methods in terms of RP while maintaining RMSE values similar to the other methods. Additionally, the proposed method exhibits complexity comparable to MUSIC and is significantly more efficient than SPICE and SBL, which require iterative operations.
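The RMSE against GPS mentioned above requires a reference bearing at each segment time, whereas GPS positions are only available every 5 min. One possible way to obtain such references is sketched below, assuming simple linear interpolation between GPS-derived bearings; the text does not state which interpolation was used, and the function names `interp_bearing` and `rmse_against_gps` are hypothetical.

```python
import bisect
import math
from typing import Sequence


def interp_bearing(
    t: float, gps_t: Sequence[float], gps_deg: Sequence[float]
) -> float:
    """Linearly interpolate the GPS-derived bearing (degrees) at time t (seconds).

    gps_t must be sorted in ascending order; times outside it are clamped.
    """
    if t <= gps_t[0]:
        return gps_deg[0]
    if t >= gps_t[-1]:
        return gps_deg[-1]
    i = bisect.bisect_right(gps_t, t) - 1
    w = (t - gps_t[i]) / (gps_t[i + 1] - gps_t[i])
    return (1.0 - w) * gps_deg[i] + w * gps_deg[i + 1]


def rmse_against_gps(
    seg_t: Sequence[float],
    est_deg: Sequence[float],
    gps_t: Sequence[float],
    gps_deg: Sequence[float],
) -> float:
    """RMSE (degrees) between per-segment DOA estimates and interpolated GPS bearings."""
    errs = [
        (e - interp_bearing(t, gps_t, gps_deg)) ** 2
        for t, e in zip(seg_t, est_deg)
    ]
    return math.sqrt(sum(errs) / len(errs))
```

This sketch assumes that bearings stay within 0° to 180° for a line array, so no angle wrap-around handling is included.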

6. Conclusions and Discussion

In this paper, we propose a novel gridless DOA estimation method for arbitrary array geometries based on a CDNN. First, by expanding the steering vector into a Fourier series, the relationship between the element-domain covariance matrix and the angular-domain covariance matrix for arbitrary arrays is derived. Then, a CDNN is designed and trained to reconstruct the Toeplitz-structured angular-domain covariance matrix from the SCM. Finally, Root-MUSIC is applied to the reconstructed angular-domain covariance matrix to obtain gridless DOA estimates. The proposed method demonstrates superior resolution and DOA estimation accuracy compared to existing methods, particularly when target angles are closely spaced. It does not require any hyperparameters to be set and exhibits high adaptability to different snapshot counts. Additionally, its lower computational complexity makes it well suited for applications that demand real-time performance or deployment on low-power platforms with limited computational resources.

In this paper, we only analyze the performance of the proposed method for single-frequency narrowband signals. In future work, we will continue to optimize the network structure and training process to make the proposed method applicable to multiple frequencies without needing prior information about the number of sources, thereby enhancing its practical utility in real-world scenarios, such as providing accurate angular estimates for localization and object tracking.

Author Contributions

Conceptualization, Y.C. and T.Z.; methodology, T.Z.; writing—review and editing, Y.C., T.Z. and Q.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (62371393, 62271397) and the Fundamental Research Funds for the Central Universities (D5000240231).

Data Availability Statement

Data are contained within this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Godara, L.C. Application of antenna arrays to mobile communications. II. Beam-forming and direction-of-arrival considerations. Proc. IEEE 1997, 85, 1195–1245.
  2. Massa, A.; Rocca, P.; Oliveri, G. Compressive Sensing in Electromagnetics - A Review. IEEE Antennas Propag. Mag. 2015, 57, 224–238.
  3. Van Trees, H.L. Optimum Array Processing: Part IV of Detection, Estimation, and Modulation Theory; Wiley: Hoboken, NJ, USA, 2004; pp. 17–79.
  4. Capon, J. High-resolution frequency-wavenumber spectrum analysis. Proc. IEEE 1969, 57, 1408–1418.
  5. Schmidt, R. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 1986, 34, 276–280.
  6. Roy, R.; Paulraj, A.; Kailath, T. ESPRIT - A subspace rotation approach to estimation of parameters of cisoids in noise. IEEE Trans. Acoust. Speech Signal Process. 1986, 34, 1340–1342.
  7. Thomas, J.K.; Scharf, L.L.; Tufts, D.W. The probability of a subspace swap in the SVD. IEEE Trans. Signal Process. 1995, 43, 730–736.
  8. Yang, Z.; Li, J.; Stoica, P.; Xie, L. Chapter 11 - Sparse methods for direction-of-arrival estimation. In Array, Radar and Communications Engineering; Chellappa, R., Theodoridis, S., Eds.; Academic Press: Cambridge, MA, USA, 2018; pp. 509–581.
  9. Zhang, G.; Liu, K.; Sun, S.; Fu, J.; Wang, J. DOA estimation method for underwater acoustic signals based on two-dimensional power distribution (TPD) for few element array. Appl. Acoust. 2021, 184, 108352.
  10. Zhang, Y.; Zhang, L.; Han, J.; Ban, Z.; Yang, Y. A new DOA estimation algorithm based on compressed sensing. Cluster Comput. 2019, 22, 895–903.
  11. Li, X.; Ma, X.; Yan, S.; Hou, C. Single snapshot DOA estimation by compressive sampling. Appl. Acoust. 2013, 74, 926–930.
  12. Stoica, P.; Babu, P.; Li, J. SPICE: A Sparse Covariance-Based Estimation Method for Array Processing. IEEE Trans. Signal Process. 2011, 59, 629–638.
  13. Tan, Z.; Yang, P.; Nehorai, A. Joint Sparse Recovery Method for Compressed Sensing with Structured Dictionary Mismatches. IEEE Trans. Signal Process. 2014, 62, 4997–5008.
  14. Wu, X.; Zhu, W.; Yan, J. Direction of Arrival Estimation for Off-Grid Signals Based on Sparse Bayesian Learning. IEEE Sens. J. 2016, 16, 2004–2016.
  15. Wang, Q.; Yu, H.; Li, J.; Chen, F. Adaptive Grid Refinement Method for DOA Estimation via Sparse Bayesian Learning. IEEE J. Ocean Eng. 2023, 48, 806–819.
  16. Jagannath, R.; Hari, K.V.S. Block Sparse Estimator for Grid Matching in Single Snapshot DoA Estimation. IEEE Signal Process. Lett. 2013, 20, 1038–1041.
  17. Xenaki, A.; Gerstoft, P. Grid-free compressive beamforming. J. Acoust. Soc. Am. 2015, 137, 1923–1935.
  18. Zhou, C.; Gu, Y.; Fan, X.; Shi, Z.; Mao, G.; Zhang, Y. Direction-of-Arrival Estimation for Coprime Array via Virtual Array Interpolation. IEEE Trans. Signal Process. 2018, 66, 5956–5971.
  19. Chu, Z.; Liu, Y.; Yang, Y.; Yang, Y. A preliminary study on two-dimensional grid-free compressive beamforming for arbitrary planar array geometries. J. Acoust. Soc. Am. 2021, 149, 3751–3757.
  20. Raj, A.G.; McClellan, J.H. Super-resolution DOA Estimation for Arbitrary Array Geometries Using a Single Noisy Snapshot. In Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019.
  21. Raj, A.G.; McClellan, J.H. Single Snapshot Super-Resolution DOA Estimation for Arbitrary Array Geometries. IEEE Signal Process. Lett. 2019, 26, 119–123.
  22. Yang, Y.; Chu, Z.; Yin, S. Two-dimensional grid-free compressive beamforming with spherical microphone arrays. Mech. Syst. Signal Proc. 2022, 169, 108642.
  23. Mahata, K.; Hyder, M.M. Grid-less T.V minimization for DOA estimation. Signal Process. 2017, 132, 155–164.
  24. Grumiaux, P.; Kitić, S.; Girin, L.; Guérin, A. A survey of sound source localization with deep learning methods. J. Acoust. Soc. Am. 2022, 152, 107–151.
  25. Papageorgiou, G.K.; Sellathurai, M.; Eldar, Y.C. Deep Networks for Direction-of-Arrival Estimation in Low SNR. IEEE Trans. Signal Process. 2021, 69, 3714–3729.
  26. Liu, Y.; Chen, H.; Wang, B. DOA estimation based on CNN for underwater acoustic array. Appl. Acoust. 2021, 172, 107594.
  27. Nie, W.; Zhang, X.; Xu, J.; Guo, L.; Yan, Y. Adaptive Direction-of-Arrival Estimation Using Deep Neural Network in Marine Acoustic Environment. IEEE Sens. J. 2023, 23, 15093–15105.
  28. Liang, C.; Liu, M.; Li, Y.; Wang, Y.; Hu, X. LDnADMM-Net: A Denoising Unfolded Deep Neural Network for Direction-of-Arrival Estimations in A Low Signal-to-Noise Ratio. Remote Sens. 2024, 16, 554.
  29. Wu, X.; Yang, X.; Jia, X.; Tian, F. A Gridless DOA Estimation Method Based on Convolutional Neural Network with Toeplitz Prior. IEEE Signal Process. Lett. 2022, 29, 1247–1251.
  30. Gao, S.; Ma, H.; Liu, H.; Yang, J.; Yang, Y. A Gridless DOA Estimation Method for Sparse Sensor Array. Remote Sens. 2023, 15, 5281.
  31. Wu, X.; Wang, J.; Yang, X.; Tian, F. A Gridless DOA Estimation Method Based on Residual Attention Network and Transfer Learning. IEEE Trans. Veh. Technol. 2024, 73, 9103–9108.
  32. Cui, Y.; Yang, F.; Zhou, M.; Hao, L.; Wang, J.; Sun, H.; Kong, A.; Qi, J. Gridless Underdetermined DOA Estimation for Mobile Agents with Limited Snapshots Based on Deep Convolutional Generative Adversarial Network. Remote Sens. 2024, 16, 626.
  33. Huang, G.; Benesty, J.; Chen, J. On the Design of Frequency-Invariant Beampatterns with Uniform Circular Microphone Arrays. IEEE-ACM Trans. Audio Speech Lang. 2017, 25, 1140–1153.
  34. Tang, G.; Bhaskar, B.N.; Shah, P.; Recht, B. Compressed Sensing Off the Grid. IEEE Trans. Inf. Theory 2013, 59, 7465–7490.
  35. Wu, X.; Zhu, W.; Yan, J. A Toeplitz Covariance Matrix Reconstruction Approach for Direction-of-Arrival Estimation. IEEE Trans. Veh. Technol. 2017, 66, 8223–8237.
  36. Pal, P.; Vaidyanathan, P.P. A Grid-Less Approach to Underdetermined Direction of Arrival Estimation Via Low Rank Matrix Denoising. IEEE Signal Process. Lett. 2014, 21, 737–741.
  37. Rao, B.D.; Hari, K.V.S. Performance analysis of Root-Music. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 1939–1949.
  38. The SWellEx-96 Experiment. 1996. Available online: http://swellex96.ucsd.edu/ (accessed on 5 September 2024).
  39. Trabelsi, C.; Bilaniuk, O.; Serdyuk, D.; Subramanian, S.; Santos, J.F.; Mehri, S.; Rostamzadeh, N.; Bengio, Y.; Pal, C.J. Deep Complex Networks. arXiv 2017, arXiv:1705.09792.
  40. Mohammadzadeh, S.; Nascimento, V.H.; Lamare, R.C.; Hajarolasvadi, N. Robust Beamforming Based on Complex-Valued Convolutional Neural Networks for Sensor Arrays. IEEE Signal Process. Lett. 2022, 29, 2108–2112.
  41. Zhang, Y.; Zeng, R.; Zhang, S.; Wang, J.; Wu, Y. Complex-Valued Neural Network with Multistep Training for Single-Snapshot DOA Estimation. IEEE Geosci. Remote Sens. Lett. 2024, 21, 1–5.
  42. Fan, Z.; Tu, Y.; Lin, Y.; Shi, Q. Class-Incremental Learning for Recognition of Complex-Valued Signals. IEEE Trans. Cogn. Commun. Netw. 2024, 10, 417–428.
  43. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
  44. Xenaki, A.; Boldt, J.B.; Christensen, M.G. Sound source localization and speech enhancement with sparse Bayesian learning beamforming. J. Acoust. Soc. Am. 2018, 143, 3912–3921.
  45. Zhou, T.; He, Z.; Shi, Q.; Lin, C.; Zhang, S. Multisnapshot High-Resolution Gridless DOA Estimation for Uniform Circular Arrays. IEEE Signal Process. Lett. 2024, 31, 1705–1709.
  46. Liu, Z.; Zhang, Y.; Wang, W.; Li, X.; Li, H.; Shi, W.; Ali, W. Infinite Weighted p-Norm Sparse Iterative DOA Estimation via Acoustic Vector Sensor Array under Impulsive Noise. J. Mar. Sci. Eng. 2023, 11, 1798.
  47. Stoica, P.; Nehorai, A. MUSIC, maximum likelihood, and Cramer-Rao bound. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 720–741.
Figure 1. (a) Magnitude of Fourier coefficients at various orders. (b) Relationship between array steering vector error and N.
Figure 2. Angular-domain covariance matrix reconstruction network architecture.
Figure 3. Training loss, validation loss, and learning rate variation with epochs for the following: (a) $N = 4$. (b) $N = 7$. (c) $N = 10$.
Figure 4. DOA estimation performance of CDNNs with truncation orders $N = 4$, $N = 7$, and $N = 10$ at different target angular separations. (a) RMSE and (b) RP for $\boldsymbol{\theta} = [85^\circ, 95^\circ]$. (c) RMSE and (d) RP for $\boldsymbol{\theta} = [80^\circ, 100^\circ]$. (e) RMSE and (f) RP for $\boldsymbol{\theta} = [70^\circ, 110^\circ]$.
Figure 5. DOA estimation results. The proposed method, source numbers (a) $K = 1$, (b) $K = 2$, and (c) $K = 3$. MUSIC, source numbers (d) $K = 1$, (e) $K = 2$, and (f) $K = 3$. SPICE, source numbers (g) $K = 1$, (h) $K = 2$, and (i) $K = 3$. SBL, source numbers (j) $K = 1$, (k) $K = 2$, and (l) $K = 3$.
Figure 6. Relationship between SNR and both RMSE and RP under spatio-temporal Gaussian white noise conditions. (a) RMSE and (b) RP for $\boldsymbol{\theta} = [80^\circ, 100^\circ]$. (c) RMSE and (d) RP for $\boldsymbol{\theta} = [70^\circ, 110^\circ]$.
Figure 7. Relationship between RP and RMSE with respect to angle separation. (a) RMSE. (b) RP.
Figure 8. Algorithm performance under different snapshot conditions. (a) RMSE. (b) RP.
Figure 9. Schematic of the SWellEx-96 Event S59 experiment scenario [38].
Figure 10. BTR results using different methods. (a) GPS. (b) CBF. (c) Proposed method. (d) MUSIC. (e) SPICE. (f) SBL.
Figure 11. Processing results of SWellEx-96 Event S59 data using different methods. (a) RP. (b) RMSE. (c) CPU time.
Table 1. Array coordinates.

| Element No. | #1 | #2 | #3 | #4 | #5 | #6 | #7 | #8 |
|---|---|---|---|---|---|---|---|---|
| x (m) | 11.93 | 9.26 | 6.50 | 3.41 | 0 | −3.75 | −7.87 | −12.43 |
| y (m) | −10.4 | −8.24 | −5.78 | −3.06 | 0 | 3.40 | 7.34 | 11.62 |
Table 2. CDNN training results.

| Model | Training Loss | Validation Loss | Test Loss |
|---|---|---|---|
| $N = 4$ | 0.0750 | 0.0762 | 0.0766 |
| $N = 7$ | 0.1185 | 0.1187 | 0.1197 |
| $N = 10$ | 0.1676 | 0.1685 | 0.1692 |
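Table 3. RMSE of DOA estimation (in degrees).

| $K$ | Range of $\theta_k$, $k = 1, 2, \ldots, K$ | Proposed Method | MUSIC | SPICE | SBL |
|---|---|---|---|---|---|
| 1 | $[45^\circ, 135^\circ]$ | 0.1278 | 0.1098 | 0.2676 | 0.2678 |
| 1 | $[0^\circ, 180^\circ]$ | 0.2239 | 0.1885 | 1.0060 | 0.6724 |
| 2 | $[45^\circ, 135^\circ]$ | 0.9051 | 0.8279 | 0.9340 | 0.9146 |
| 2 | $[0^\circ, 180^\circ]$ | 2.0294 | 2.9906 | 3.9232 | 3.2688 |
| 3 | $[45^\circ, 135^\circ]$ | 2.4844 | 2.2639 | 2.1821 | 2.0246 |
| 3 | $[0^\circ, 180^\circ]$ | 5.0632 | 9.1194 | 8.6687 | 7.1800 |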
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cao, Y.; Zhou, T.; Zhang, Q. Gridless DOA Estimation Method for Arbitrary Array Geometries Based on Complex-Valued Deep Neural Networks. Remote Sens. 2024, 16, 3752. https://doi.org/10.3390/rs16193752
