Article

Semi-Empirical Approach to Evaluating Model Fit for Sea Clutter Returns: Focusing on Future Measurements in the Adriatic Sea

Department of Communication and Space Technologies, Faculty of Electrical Engineering and Computing, University of Zagreb, 10000 Zagreb, Croatia
Current address: Software Development Center, emovis tehnologije d.o.o., 21000 Split, Croatia.
Entropy 2024, 26(12), 1069; https://doi.org/10.3390/e26121069
Submission received: 20 October 2024 / Revised: 24 November 2024 / Accepted: 5 December 2024 / Published: 9 December 2024
Figure 1. Comparison of empirical and semi-empirical estimates of KL divergence. (a) Forward. (b) Reverse.
Figure 2. Comparison of MSE of empirical and semi-empirical estimates of KL divergence. (a) Forward. (b) Reverse.
Figure 3. Comparison of empirical and semi-empirical estimates. (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 4. Comparison of empirical and semi-empirical estimates of KL divergence using GP distribution as model and real sea clutter data. (a) Forward. (b) Reverse.
Figure 5. Comparison of empirical and semi-empirical estimates of KL divergence using K distribution as model and real sea clutter data. (a) Forward. (b) Reverse.
Figure 6. Comparison of variances of empirical and semi-empirical estimates of KL divergence using GP and K distribution as models and real sea clutter data. (a) Forward. (b) Reverse.
Figure 7. Comparison of empirical and semi-empirical estimates of SH distance using GP and K distribution as models and real sea clutter data. (a) K distribution. (b) GP distribution.
Figure 8. Comparison of variances of empirical and semi-empirical estimates of SH distance using GP and K distributions as models and real sea clutter data.
Figure 9. Semi-empirical estimation of KL divergence between an empirical dataset following a unit-mean exponential distribution, $\mathrm{Exp}(1)$, and a model distribution following a normal distribution, $\mathcal{N}(3,4)$. (a) Forward estimation. (b) Reverse estimation.
Figure 10. MSE of the KL divergence estimation between an empirical dataset following a unit-mean exponential distribution, $\mathrm{Exp}(1)$, and a model distribution following a normal distribution, $\mathcal{N}(3,4)$. (a) Forward. (b) Reverse.
Figure 11. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution $\mathcal{N}(3,4)$ and exponential model distribution $\mathrm{Exp}(1)$. (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 12. Semi-empirical estimation of the KL divergence between two normal distributions, with the empirical dataset following $\mathcal{N}(0,1)$ and the model distribution following $\mathcal{N}(0,2)$. (a) Forward estimation. (b) Reverse estimation.
Figure 13. MSE of KL divergence estimation between two normal distributions, empirical dataset following $\mathcal{N}(0,1)$ and model distribution following $\mathcal{N}(0,2)$. (a) Forward. (b) Reverse.
Figure 14. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution $\mathcal{N}(0,1)$ and normal model distribution $\mathcal{N}(0,2)$. (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 15. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution $\mathcal{N}(0,1)$ and normal model distribution $\mathcal{N}(1,1)$. (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 16. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution $\mathcal{N}(0,1)$ and normal model distribution $\mathcal{N}(2,1)$. (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 17. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution $\mathcal{N}(0,4)$ and normal model distribution $\mathcal{N}(1,1)$. (a) SH distance estimation. (b) MSE of SH distance estimation.

Abstract

A method for evaluating Kullback–Leibler (KL) divergence and Squared Hellinger (SH) distance between empirical data and a model distribution is proposed. This method exclusively utilises the empirical Cumulative Distribution Function (CDF) of the data and the CDF of the model, avoiding data processing such as histogram binning. The proposed method converges almost surely, with the proof based on the use of exponentially distributed waiting times. An example demonstrates convergence of the KL divergence and SH distance to their true values when utilising the Generalised Pareto (GP) distribution as empirical data and the K distribution as the model. Another example illustrates the goodness of fit of these (GP and K-distribution) models to real sea clutter data from the widely used Intelligent PIxel processing X-band (IPIX) measurements. The proposed method can be applied to assess the goodness of fit of various models (not limited to GP or K distribution) to clutter measurement data such as those from the Adriatic Sea. Distinctive features of this small and immature sea, like the presence of over 1300 islands that affect local wind and wave patterns, are likely to result in an amplitude distribution of sea clutter returns that differs from predictions of models designed for oceans or open seas. However, to the author’s knowledge, no data on this specific topic are currently available in the open literature, and such measurements have yet to be conducted.

1. Introduction

As discussed in [1], civilian coastline safety can be increased by covering gaps in the coverage of the principal radars with lightweight Commercial Off-The-Shelf (COTS) radar sensors installed on mobile platforms, which search for illegal vessels that are presumably dim, manoeuvring and embedded in sea clutter. Due to the requirement for a small antenna profile, the operating frequency of these sensors lies in the X band and the amplitude output is logarithmically rectified, resulting in the loss of phase information. A typical monitoring scenario involves tracking a highly manoeuvring Rigid Inflatable Boat (RIB), a dim target whose echo is significantly interfered with by sea clutter returns. During algorithm performance evaluation, various tracking scenarios require the replication of clutter interference to a high degree of accuracy. Specifics of the surveillance area, like the existence of islands and littoral environments, greatly affect local wind and wave characteristics. An example is the Croatian part of the Adriatic basin, which is a small and enclosed sea with more than 1300 islands, where surface wind waves are limited by fetch and wind duration, and depths cannot be neglected in most of the basin. For these reasons, the Adriatic Sea is an immature sea, with steeper waves than its counterpart in the ocean [2]. As, to the best of the author’s knowledge, measured clutter statistics of the Croatian part of the basin do not exist in the open literature, it is reasonable (to a certain extent) to expect clutter amplitude statistics that differ from those of models verified for the ocean, under the same wind conditions and particularly in littoral environments. Hence, in this specific context, replicating clutter interference necessitates modelling both the texture and speckle amplitude distributions, as well as the short- and long-term correlations.
With this in mind, assessing the goodness of fit between the proposed models and the empirical data becomes crucial. However, through the rest of this paper, only clutter amplitude distribution is considered.
Goodness-of-fit measures are not always provided in the open literature in relation to sea clutter amplitude statistical fit. Some authors provide only visual goodness of fit, like in [3], where a proposed KK distribution is compared with a KA distribution in the tail region. The same dataset was used in [4] to extend the KK distribution with thermal noise, and a visual comparison with the K distribution was given. Other examples where only visual fit was presented are [5], where empirical amplitude distribution was fitted with the K distribution; ref. [6], where the authors fitted heavy-tailed empirical data with K, GP and Weibull distributions; ref. [7], where goodness of fit was given as the visual deviation of empirical data from the theoretical t distribution and an inverse Gaussian model in quantile–quantile plots; refs. [8,9], where the empirical amplitude distribution under the condition of short fetch was analysed and fitted with the K distribution; ref. [10], where empirical data were compared to Rayleigh, log-normal, K and Weibull distributions; and recently [11], where empirical data were modelled as a K distribution with noise, and the texture distribution was represented by gamma, inverse gamma, inverse Gaussian and log-normal distributions. Examples where a goodness-of-fit measure was provided by means of the chi-squared test are [12], where the fits of the K and Weibull distributions were tested, and [13], where Rayleigh and log-normal distributions were tested in addition to previously mentioned models. Other examples where goodness-of-fit measures were provided are [14], where an empirical cumulative distribution was used to perform Kolmogorov–Smirnov (KS) tests with the reference K distribution; ref. [15], where the goodness-of-fit measure for the proposed Compound Gaussian Inverse Gaussian (CGIG) texture model was given as the mean absolute quantile deviation; ref. [16], where the K-distribution fit was tested using a KS test and measure of fit was given as root mean square error; ref. [17], where the fit of Weibull, log-normal and Ricean Inverse Gaussian (RiIG) [18] distributions to the proposed CGIG distribution was measured using the Mean Square Error (MSE) criterion; and [19], where segmented sea clutter data in littoral environments were fitted with the K distribution and measure of fit was given as the absolute difference between the empirical and theoretical CDF. Recently, in [20], Bhattacharyya distance was used as a goodness-of-fit measure between empirical data and the K distribution with noise (including its variant with an extra Rayleigh component). The GP distribution with noise and the tri-modal discrete distribution were proposed in [21]. In [22], sea clutter data were fitted with Weibull, log-normal, generalised gamma, $G^0$ and $\alpha$-stable distributions and comprehensive goodness-of-fit measures were provided, including KL divergence, Bhattacharyya distance and MSE, and finally, in [23], the authors fitted real sea clutter data with various distributions such as CGIG, GP and K distributions and goodness of fit was measured using KS distance and KL divergence, using both the empirical Probability Density Function (PDF) and the empirical CDF. Besides the papers referenced above, interested readers can find out more about K, KA, GP, Rayleigh, log-normal and Weibull distributions in [24,25,26,27,28], gamma and inverse gamma texture models in [29], $G^0$ distributions in [30] and $\alpha$-stable distributions in [31].
It is interesting that, despite the extensive literature on clutter modelling, the author could not find any published goodness-of-fit tests that utilise the Wasserstein distance within the optimal transportation framework. However, ref. [32] provides a robust foundation for assessing goodness of fit between empirical data and a model in both univariate and multivariate cases, which involves solving the semi-discrete optimal transport problem. Additionally, ref. [33] establishes a Central Limit Theorem for semi-discrete distributions. These foundational contributions provide a strong basis for developing future goodness-of-fit tests for clutter modelling using the Wasserstein distance.
The contribution of this paper is the proposition of an estimator for KL divergence and SH distance that relies exclusively on the empirical and model CDFs, unlike the aforementioned methods that assess the goodness of fit using histogram binning of observed data. This work builds upon the methodology introduced in [34,35], where empirical CDFs of both distributions that were compared were employed in the empirical estimation of KL divergence and SH distance, respectively. Unlike empirical estimators, the estimator proposed in this work is semi-empirical and is restricted to univariate distributions. Although the empirical estimators proposed in [34,35] can be used to compare empirical data and models by generating samples from the former, the semi-empirical estimation of KL divergence and SH distance yields results with smaller variance.
This paper is organised as follows. The subsequent section provides preliminaries required for deriving the semi-empirical KL divergence and SH distance estimator. In Section 4, the semi-empirical estimation method derived in Section 3 is used to assess the goodness of fit for K and GP distributions when applied to empirical data from the commonly used IPIX dataset [36]. Additional numerical examples are provided to demonstrate that the proposed method is not limited to K and GP distributions only. These examples draw inspiration from those presented in [34,35]. Finally, a conclusion is given in Section 5.

2. Preliminaries

Suppose that empirical data are given as a totally ordered set of independent and identically distributed (i.i.d.) samples from an unknown univariate probability distribution $p$ and are denoted as $\mathcal{X} = \{x_i,\ i = 1, \ldots, n\}$. While the CDF of $p$ is denoted as $P$, the empirical CDF of set $\mathcal{X}$ is denoted as $P_e$ and is given as a sum of Heaviside unit step piecewise constant functions [37]
$$\theta(x) = \begin{cases} 1, & x > 0 \\ 1/2, & x = 0 \\ 0, & x < 0 \end{cases}$$
such that
$$P_e(x) = \frac{1}{n} \sum_{i=1}^{n} \theta(x - x_i). \tag{1}$$
The univariate model, an approximation of $P$ expressed in analytical form, has the PDF and CDF labelled as $q$ and $Q$, respectively. Samples drawn from this distribution are represented as a totally ordered set $\mathcal{X}' = \{x'_j,\ j = 1, \ldots, m\}$.
In [34], the discrete distribution (1) is linearised as
$$P_c(x) = \begin{cases} 0, & x < x_0 < \inf\{\mathcal{X}, \mathcal{X}'\} \\ \alpha_i x + \beta_i, & x_{i-1} \le x < x_i,\ i = 1, \ldots, n \\ 1, & x \ge x_{n+1} > \sup\{\mathcal{X}, \mathcal{X}'\} \end{cases} \tag{2}$$
which makes $P_c$ continuous. While $\alpha_i$ and $\beta_i$ are chosen so that $P_c$ matches $P_e$ at the sample points, the values of $x_0$ and $x_{n+1}$ are not important.
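The construction of $P_e$ and $P_c$ can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the function names `empirical_cdf` and `continuous_cdf` are hypothetical, the samples are assumed to be distinct, and the final linear segment from $x_n$ up to $x_{n+1}$ is an assumption made here so that the interpolant reaches 1.

```python
import numpy as np

def empirical_cdf(samples):
    """P_e of Eq. (1): average of Heaviside steps, with theta(0) = 1/2."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size

    def P_e(t):
        t = np.asarray(t, dtype=float)
        below = np.searchsorted(x, t, side="left")         # count of x_i < t
        at_or_below = np.searchsorted(x, t, side="right")  # count of x_i <= t
        return (below + 0.5 * (at_or_below - below)) / n

    return P_e

def continuous_cdf(samples, x0, x_np1):
    """P_c of Eq. (2): linear interpolation through the points (x_i, P_e(x_i)).

    x0 and x_np1 are the outer knots where P_c is 0 and 1 (assumed choice).
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    knots_x = np.concatenate(([x0], x, [x_np1]))
    # For distinct samples, P_e(x_i) = (i - 1/2) / n.
    knots_p = np.concatenate(([0.0], (np.arange(1, n + 1) - 0.5) / n, [1.0]))

    def P_c(t):
        return np.interp(t, knots_x, knots_p)

    return P_c
```

With four samples, for instance, both functions take the values 1/8, 3/8, 5/8 and 7/8 at the sorted sample points, and only `P_c` is continuous between them.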
Increments of the continuous empirical distribution $P_c$ and model distribution $Q$ with respect to a sample $x_i \in \mathcal{X}$ are defined as
$$\delta P_c(x_i) = P_c(x_i) - P_c(x_i - \epsilon), \qquad \delta Q(x_i) = Q(x_i) - Q(x_i - \epsilon),$$
where $\epsilon \in \left(0, \min_{i=1,\ldots,n}\{x_i - x_{i-1}\}\right)$. For the sample increment $\Delta x_i = x_i - x_{i-1}$, the increment of the continuous empirical distribution is defined as
$$\Delta P_c(x_i) = P_c(x_i) - P_c(x_{i-1})$$
and the increment of model distribution as
$$\Delta Q(x_i) = Q(x_i) - Q(x_{i-1}).$$
Similarly, increments of continuous empirical and model distributions with respect to a sample $x'_j \in \mathcal{X}'$ are defined as
$$\delta P_c(x'_j) = P_c(x'_j) - P_c(x'_j - \epsilon), \qquad \delta Q(x'_j) = Q(x'_j) - Q(x'_j - \epsilon).$$
Furthermore, the definition of sample points in set $\mathcal{X}$ with reference to sample $x'_j$ is
$$x_{n_j}^{+} = \min_{i=1,\ldots,n}\{x_i : x_i \ge x'_j\}, \qquad x_{n_j}^{-} = \max_{i=1,\ldots,n}\{x_i : x_i < x'_j\},$$
which allows the definition of the sampling interval in set $\mathcal{X}$ with reference to the interval $\Delta x'_j = x'_j - x'_{j-1}$ as
$$\Delta x_{n_j} = x_{n_j}^{+} - x_{n_j}^{-}.$$
Thus, increments of continuous empirical and model distributions with respect to the sample increment $\Delta x'_j$ are defined as
$$\Delta P_c(x_{n_j}) = P_c(x_{n_j}^{+}) - P_c(x_{n_j}^{-}), \qquad \Delta Q(x'_j) = Q(x'_j) - Q(x'_{j-1}).$$
In [38], a divergence between distributions $P$ and $Q$ is introduced as symmetrised Jeffreys distance $J$ [39], employing corresponding densities $p$ and $q$ as
$$J = \int_{\mathbb{R}} \left(p(x) - q(x)\right) \log\frac{p(x)}{q(x)}\, dx, \tag{3}$$
where continuous distributions $P$ and $Q$ exist with respect to a Lebesgue measure. Divergence (3) can be understood as a member of the symmetric $\alpha$-divergence family [40], interpreted here for univariate densities as
$$D^{(\alpha)} = \frac{1}{2} \int \frac{\left(p(x)^{\alpha} - q(x)^{\alpha}\right)\left(p(x)^{1-\alpha} - q(x)^{1-\alpha}\right)}{\alpha(1-\alpha)}\, dx, \qquad \alpha \in \mathbb{R} \setminus \{0, 1\} \tag{4}$$
in the special limiting case where $\alpha \to 0$ or $\alpha \to 1$ such that $\lim_{\alpha \to 0} D^{(\alpha)} = \lim_{\alpha \to 1} D^{(\alpha)} = J$. Divergence (3) can be rewritten as a sum of more commonly used asymmetrical forward
$$D(P\|Q) = \int_{\mathbb{R}} p(x) \log\frac{p(x)}{q(x)}\, dx$$
and reverse
$$D(Q\|P) = \int_{\mathbb{R}} q(x) \log\frac{q(x)}{p(x)}\, dx$$
KL divergences as
$$J = \int_{\mathbb{R}} p(x) \log\frac{p(x)}{q(x)}\, dx + \int_{\mathbb{R}} q(x) \log\frac{q(x)}{p(x)}\, dx = D(P\|Q) + D(Q\|P), \qquad D(P\|Q),\ D(Q\|P) \ge 0,$$
which are, throughout the remainder of this work, estimated using a semi-empirical method. For $\alpha = 1/2$, (4) simplifies to the SH distance as $D^{(1/2)} = 4H^2$, where the SH distance
$$H^2 = \frac{1}{2} \int_{\mathbb{R}} \left(\sqrt{p(x)} - \sqrt{q(x)}\right)^2 dx$$
between unknown distributions $P$ and $Q$ can be expressed with Hellinger affinity
$$A = 1 - H^2 = \int_{\mathbb{R}} \sqrt{p(x)\, q(x)}\, dx, \tag{8}$$
utilising their corresponding densities $p$ and $q$. Observe that the Hellinger affinity (8) shares the same definition as the Bhattacharyya coefficient, initially introduced in [41]. Furthermore, the value $\sqrt{1 - A}$ meets the triangle inequality [42], a property that KL divergence lacks. Both the Hellinger affinity and the Bhattacharyya coefficient are interpreted as indicators of similarity between two probability distributions.
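As a quick numerical illustration of the definitions above, the sketch below evaluates the forward and reverse KL divergences, the Jeffreys distance (3), the SH distance and the Hellinger affinity (8) by quadrature for two zero-mean normal densities. The choice of densities (standard deviations 1 and 2), the integration limits and all variable names are assumptions made for this example only.

```python
import numpy as np
from scipy import integrate, stats

# Illustrative densities (assumed for this example): zero-mean normals
# with standard deviations 1 and 2.
P = stats.norm(loc=0.0, scale=1.0)
Q = stats.norm(loc=0.0, scale=2.0)

def integrate_real_line(f, half_width=20.0):
    """Integrate f over [-half_width, half_width]; the tails beyond are negligible here."""
    value, _abs_err = integrate.quad(f, -half_width, half_width, limit=200)
    return value

def log_ratio(x):
    """log(p(x) / q(x)), computed from log-densities to avoid underflow."""
    return P.logpdf(x) - Q.logpdf(x)

forward = integrate_real_line(lambda x: P.pdf(x) * log_ratio(x))    # D(P||Q)
reverse = integrate_real_line(lambda x: -Q.pdf(x) * log_ratio(x))   # D(Q||P)
jeffreys = integrate_real_line(lambda x: (P.pdf(x) - Q.pdf(x)) * log_ratio(x))
hellinger_sq = 0.5 * integrate_real_line(
    lambda x: (np.sqrt(P.pdf(x)) - np.sqrt(Q.pdf(x))) ** 2
)
affinity = integrate_real_line(lambda x: np.sqrt(P.pdf(x) * Q.pdf(x)))

print(f"D(P||Q) = {forward:.4f}, D(Q||P) = {reverse:.4f}, J = {jeffreys:.4f}")
print(f"H^2 = {hellinger_sq:.4f}, A = {affinity:.4f}")
```

For this pair of normals the closed forms are $D(P\|Q) = \log 2 - 3/8$, $D(Q\|P) = 3/2 - \log 2$ and $A = \sqrt{4/5}$, so the quadrature results can be checked against $J = D(P\|Q) + D(Q\|P)$ and $A = 1 - H^2$.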

3. Derivation of Semi-Empirical Estimator

In this section, semi-empirical estimators of the KL divergence and SH distance between the unknown distribution $P$, represented by the empirical data, and a model distribution $Q$ are derived.

3.1. Semi-Empirical KL Divergence Estimator

Drawing upon the findings from [34], the estimation of forward KL divergence $D(P\|Q)$ is proposed as
$$\hat{D}(P\|Q) = \frac{1}{n} \sum_{i=1}^{n} \log\frac{\delta P_c(x_i)}{\delta Q(x_i)}, \tag{9}$$
which is semi-empirical as $P_c$ is a continuous empirical distribution, given with (2), and $Q$ is a model distribution in analytic form, with the approximation of $P$ based on samples from set $\mathcal{X}$. An analogous, semi-empirical estimation of reverse KL divergence $D(Q\|P)$ is proposed as
$$\hat{D}(Q\|P) = \frac{1}{m} \sum_{j=1}^{m} \log\frac{\delta Q(x'_j)}{\delta P_c(x'_j)}. \tag{10}$$
The corollary presented in the following section pertains to Theorem 1 outlined in [34] and is reproduced here for clarity of further reading.
Theorem 1 ([34]). Let $P$ and $Q$ be absolutely continuous probability measures and assume its KL divergence is finite. Let $\mathcal{X} = \{x_i\}_{i=1}^{n}$ and $\mathcal{X}' = \{x'_i\}_{i=1}^{m}$ be i.i.d. samples sorted in increasing order, respectively, from $P$ and $Q$; then,
$$\hat{D}(P\|Q) - 1 \xrightarrow{a.s.} D(P\|Q).$$
In the proof of Theorem 1 in [34], the authors rearranged $\hat{D}(P\|Q)$ as
$$\hat{D}(P\|Q) = \frac{1}{n} \sum_{i=1}^{n} \log\frac{\Delta P(x_i)/\Delta x_i}{\Delta Q(x'_{m_i})/\Delta x'_{m_i}} - \frac{1}{n} \sum_{i=1}^{n} \log\frac{\Delta P(x_i)}{\Delta P_c(x_i)} + \frac{1}{n} \sum_{i=1}^{n} \log\frac{\Delta Q(x'_{m_i})}{\Delta Q_c(x'_{m_i})} \tag{11}$$
where $\Delta x'_{m_i} = \min\{x'_j \mid x'_j \ge x_i\} - \max\{x'_j \mid x'_j < x_i\}$ and $\Delta Q_c(x'_{m_i}) = Q(\min\{x'_j \mid x'_j \ge x_i\}) - Q(\max\{x'_j \mid x'_j < x_i\})$. The authors demonstrated that the first term in Equation (11) converges almost surely to $D(P\|Q)$, the second term converges almost surely to the negated Euler constant $-\gamma$ [43] (p. 905) and the third term converges almost surely to $1 - \gamma$. Combining these results, the authors concluded that $\hat{D}(P\|Q) - 1 \xrightarrow{a.s.} D(P\|Q)$.
Corollary 1.
Let $Q$ denote the model CDF expressed in analytic form. The set of samples drawn from this distribution is denoted as $\mathcal{X}' = \{x'_j,\ j = 1, \ldots, m\}$. Set $\mathcal{X}'$ is obtained through the inverse transformation $Q^{-1}(\mathcal{U})$ with $\mathcal{U} = \{j/(m+1),\ j = 1, \ldots, m\}$ as a totally ordered set. Then,
$$\hat{D}(P\|Q) - \gamma \xrightarrow{a.s.} D(P\|Q)$$
and
$$\hat{D}(Q\|P) - 1 + \gamma \xrightarrow{a.s.} D(Q\|P).$$
Proof of Corollary 1.
Given that $\mathcal{U}$ forms a totally ordered set, it follows that $\mathcal{X}'$ also constitutes a totally ordered set. Each element $u_j \in \mathcal{U}$ is the expected value of the $j$-th order statistic in a sample of size $m$ drawn from a uniform distribution over the open interval $(0, 1)$ [44] (p. 61). Thus, compared to the empirical method proposed in [34], stochastic realisations of set $\mathcal{X}'$ are replaced with deterministic values, which reduces the estimation variance. Furthermore, as $P_c$ is continuous, (9) can be rewritten as in [34]:
$$\hat{D}(P\|Q) = \frac{1}{n} \sum_{i=1}^{n} \log\frac{\Delta P_c(x_i)/\Delta x_i}{\Delta Q(x_i)/\Delta x_i},$$
and further reformulated as
$$\hat{D}(P\|Q) = \frac{1}{n} \sum_{i=1}^{n} \log\frac{\Delta P(x_i)/\Delta x_i}{\Delta Q(x_i)/\Delta x_i} - \frac{1}{n} \sum_{i=1}^{n} \log\frac{\Delta P(x_i)}{\Delta P_c(x_i)}. \tag{13}$$
The first term of the right-hand side converges almost surely to $D(P\|Q)$, and the second results in $\frac{1}{n} \sum_{i=1}^{n} \log n \Delta P(x_i)$. As indicated in [34], if realisation $P(x_i)$ is thought of as a time event at $x_i$ in a Poisson point process, then $\Delta P(x_i) = P(x_i) - P(x_{i-1})$ is the time difference between two successive events, and quantity $z_i = n \Delta P(x_i)$ follows an exponential distribution $f(z; 1) = \exp(-z)$ with a mean of 1. Thus,
$$\frac{1}{n} \sum_{i=1}^{n} \log z_i \xrightarrow{a.s.} \int_{\mathbb{R}} \log(z) f(z; 1)\, dz = \int_{0}^{\infty} \log(z) \exp(-z)\, dz$$
where $\int_{0}^{\infty} \log(z) \exp(-z)\, dz = -\gamma$ [43] (p. 906) and (13) converges almost surely to $D(P\|Q) + \gamma$.
Semi-empirical estimation given with (10) can be further reformulated to
$$\hat{D}(Q\|P) = \frac{1}{m} \sum_{j=1}^{m} \log\frac{\Delta Q(x'_j)/\Delta x'_j}{\Delta P(x_{n_j})/\Delta x_{n_j}} + \frac{1}{m} \sum_{j=1}^{m} \log\frac{\Delta P(x_{n_j})}{\Delta P_c(x_{n_j})}. \tag{14}$$
The first term of the right-hand side converges almost surely to $D(Q\|P)$, and, as demonstrated in [34], the summation in the second term can be rephrased as the sum of samples from set $\mathcal{X}$, incorporating a multiplication factor $m \Delta Q(x_i)$. This factor represents the number of samples from set $\mathcal{X}'$ occurring between two consecutive samples from set $\mathcal{X}$. Thus,
$$\frac{1}{m} \sum_{j=1}^{m} \log\frac{\Delta P(x_{n_j})}{\Delta P_c(x_{n_j})} = \frac{1}{m} \sum_{i=1}^{n} m \Delta Q(x_i) \log\frac{\Delta P(x_i)}{\Delta P_c(x_i)} = \frac{1}{n} \sum_{i=1}^{n} \frac{\Delta Q(x_i)/\Delta x_i}{\Delta P(x_i)/\Delta x_i}\, n \Delta P(x_i) \log n \Delta P(x_i). \tag{15}$$
As before, $\Delta P(x_i)$ can be thought of as a time difference between two consecutive events at $x_i$ and $x_{i-1}$ in a Poisson point process, so $z_i = n \Delta P(x_i)$ follows an exponential distribution $f(z; 1) = \exp(-z)$ with a mean value of 1. Moreover, as $\Delta Q(x_i)/\Delta x_i$ approaches $q(x)$ and $\Delta P(x_i)/\Delta x_i$ approaches $p(x)$,
$$\frac{1}{m} \sum_{j=1}^{m} \log\frac{\Delta P(x_{n_j})}{\Delta P_c(x_{n_j})} \xrightarrow{a.s.} \int_{\mathbb{R}^2} p(x) \frac{q(x)}{p(x)}\, z \log(z) f(z; 1)\, dx\, dz = \int_{\mathbb{R}} q(x)\, dx \int_{0}^{\infty} z \log(z) \exp(-z)\, dz.$$
Since $\int_{\mathbb{R}} q(x)\, dx = 1$ and $\int_{0}^{\infty} z \log(z) \exp(-z)\, dz$ results in $1 - \gamma$ [43] (p. 573), (14) converges almost surely to $D(Q\|P) + 1 - \gamma$. □
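The two bias-corrected estimates of Corollary 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the author's code: the function names `continuous_ecdf` and `semi_empirical_kl` are hypothetical, the samples are assumed to be distinct, $\epsilon$ is taken as half the smallest spacing in $\mathcal{X}$, and the outer knots $x_0$ and $x_{n+1}$ are placed one small padding beyond the extreme points.

```python
import numpy as np
from scipy import stats

def continuous_ecdf(x_sorted, x0, x_np1):
    """Piecewise-linear P_c through (x_i, (i - 1/2)/n), with P_c(x0)=0 and P_c(x_np1)=1."""
    n = x_sorted.size
    knots_x = np.concatenate(([x0], x_sorted, [x_np1]))
    knots_p = np.concatenate(([0.0], (np.arange(1, n + 1) - 0.5) / n, [1.0]))
    return lambda t: np.interp(t, knots_x, knots_p)

def semi_empirical_kl(samples, model_cdf, model_ppf, m=None):
    """Return bias-corrected estimates of (D(P||Q), D(Q||P)) following Corollary 1."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    m = n if m is None else m

    # Deterministic model points x'_j = Q^{-1}(j / (m + 1)).
    u = np.arange(1, m + 1) / (m + 1)
    xq = model_ppf(u)

    # Outer knots x_0 and x_{n+1}; their exact position is an assumed choice.
    lo, hi = min(x[0], xq[0]), max(x[-1], xq[-1])
    pad = (hi - lo) / n
    P_c = continuous_ecdf(x, lo - pad, hi + pad)

    # epsilon below the smallest spacing of the empirical samples.
    eps = 0.5 * np.diff(x).min()
    d_Pc = lambda t: P_c(t) - P_c(t - eps)
    d_Q = lambda t: model_cdf(t) - model_cdf(t - eps)

    forward_raw = np.mean(np.log(d_Pc(x) / d_Q(x)))    # Eq. (9)
    reverse_raw = np.mean(np.log(d_Q(xq) / d_Pc(xq)))  # Eq. (10)
    return forward_raw - np.euler_gamma, reverse_raw - 1.0 + np.euler_gamma

# Illustrative use: data from N(0, 1), model N(1, 1), for which both
# KL divergences equal 0.5 in closed form.
rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=20000)
model = stats.norm(loc=1.0, scale=1.0)
fwd, rev = semi_empirical_kl(data, model.cdf, model.ppf)
print(f"forward estimate: {fwd:.3f}, reverse estimate: {rev:.3f}")
```

Only the model CDF and its inverse enter the computation, in line with the method's reliance on CDFs alone. A very small $\epsilon$ makes the CDF differences sensitive to floating-point rounding, so a practical implementation may prefer the interval form used in the proof above.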

3.2. Semi-Parametric SH Distance Estimator

Building upon a finding from [35], the estimation of (8) involves employing a continuous empirical distribution $P_c$ and a model distribution $Q$ expressed in analytic form, as proposed in the following manner:
$$\hat{A}(P, Q) = \frac{1}{n} \sum_{i=1}^{n} \sqrt{\frac{\delta Q(x_i)}{\delta P_c(x_i)}}, \tag{16}$$
which is a semi-empirical estimation, as argued in the previous subsection. To resemble the notation used for the estimation of KL divergence in the previous subsection, (16) is referred to as forward estimation. Given the symmetry of the SH distance, the reverse estimation of Hellinger affinity is proposed as
$$\hat{A}(Q, P) = \frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{\delta P_c(x'_j)}{\delta Q(x'_j)}}, \tag{17}$$
and a semi-empirical estimator of SH distance is denoted as
$$\hat{H}^2 = 1 - \hat{A}, \qquad \hat{A} = \frac{1}{2}\left[\hat{A}(P, Q) + \hat{A}(Q, P)\right] \tag{18}$$
which is similar to the approach taken by the authors in [35]. Throughout the remainder of this subsection, it will be demonstrated that semi-empirical estimates (16) and (17) converge almost surely to the true value of $A$. Therefore, the following corollary is derived from Theorem 1 [34], as well as insights from [35].
Corollary 2.
Let $Q$ denote the model CDF expressed in analytic form. The set of samples drawn from this distribution is denoted as $\mathcal{X}' = \{x'_j,\ j = 1, \ldots, m\}$. Set $\mathcal{X}'$ is obtained through the inverse transformation $Q^{-1}(\mathcal{U})$ with $\mathcal{U} = \{j/(m+1),\ j = 1, \ldots, m\}$ as a totally ordered set. Then,
$$\frac{2}{\sqrt{\pi}} \hat{A}(P, Q) \xrightarrow{a.s.} A, \qquad \frac{2}{\sqrt{\pi}} \hat{A}(Q, P) \xrightarrow{a.s.} A,$$
and consequently,
$$\frac{4}{\pi} \hat{A} \xrightarrow{a.s.} A.$$
Proof of Corollary 2.
Applying the identical reasoning as presented in the proof of Corollary 1, (16) is rewritten as
$$\hat{A}(P, Q) = \frac{1}{n} \sum_{i=1}^{n} \sqrt{\frac{\Delta Q(x_i)/\Delta x_i}{\Delta P(x_i)/\Delta x_i}} \sqrt{\frac{\Delta P(x_i)}{\Delta P_c(x_i)}} = \frac{1}{n} \sum_{i=1}^{n} \sqrt{\frac{\Delta Q(x_i)/\Delta x_i}{\Delta P(x_i)/\Delta x_i}} \sqrt{z},$$
where $z = n \Delta P(x_i)$ follows a unit-mean exponential distribution $f(z; 1) = \exp(-z)$. Therefore, as $\Delta Q(x_i)/\Delta x_i$ tends to $q(x)$ and $\Delta P(x_i)/\Delta x_i$ tends to $p(x)$, it follows that
$$\hat{A}(P, Q) \xrightarrow{a.s.} \int_{\mathbb{R}^2} p(x) \sqrt{\frac{q(x)}{p(x)}} \sqrt{z}\, f(z; 1)\, dx\, dz = \int_{\mathbb{R}} \sqrt{p(x)\, q(x)}\, dx \int_{0}^{\infty} \sqrt{z} \exp(-z)\, dz = \frac{\sqrt{\pi}}{2} A,$$
since $\int_{0}^{\infty} \sqrt{z} \exp(-z)\, dz = \Gamma(3/2) = \sqrt{\pi}/2$ [43] (p. 897) and $\int_{\mathbb{R}} \sqrt{p(x)\, q(x)}\, dx = A$ with the help of the Hellinger affinity definition (8).
The reverse estimation of Hellinger affinity (17) can be reformulated as
$$\hat{A}(Q, P) = \frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{\Delta P(x_{n_j})/\Delta x_{n_j}}{\Delta Q(x'_j)/\Delta x'_j}} \sqrt{\frac{\Delta P_c(x_{n_j})}{\Delta P(x_{n_j})}} = \frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{\Delta P(x_{n_j})/\Delta x_{n_j}}{\Delta Q(x'_j)/\Delta x'_j}} \sqrt{\frac{1}{n \Delta P(x_{n_j})}},$$
and considering that $n \Delta P(x_{n_j})$ is independent from the distribution $P$,
$$\hat{A}(Q, P) \xrightarrow{a.s.} \frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{\Delta P(x_{n_j})/\Delta x_{n_j}}{\Delta Q(x'_j)/\Delta x'_j}} \cdot \frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{1}{n \Delta P(x_{n_j})}}.$$
While the first sum of the right-hand side converges almost surely to $A$ when the ratios $\Delta P(x_{n_j})/\Delta x_{n_j}$ and $\Delta Q(x'_j)/\Delta x'_j$ tend to $p(x)$ and $q(x)$, respectively, the second sum can be restated by utilising samples from the set $\mathcal{X}$ and incorporating the factor $m \Delta Q(x_i)$. Here, as previously mentioned in Section 3.1 and illustrated in (15), $m \Delta Q(x_i)$ denotes the count of samples from set $\mathcal{X}'$ between two consecutive samples from set $\mathcal{X}$. Thus,
$$\frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{1}{n \Delta P(x_{n_j})}} = \frac{1}{m} \sum_{i=1}^{n} \frac{m \Delta Q(x_i)}{\sqrt{n \Delta P(x_i)}} = \frac{1}{n} \sum_{i=1}^{n} \frac{\Delta Q(x_i)/\Delta x_i}{\Delta P(x_i)/\Delta x_i} \frac{n \Delta P(x_i)}{\sqrt{n \Delta P(x_i)}}. \tag{19}$$
From the findings outlined in Section 3.1, it is evident that the variable $z_i = n \Delta P(x_i)$ follows a unit-mean exponential distribution $f(z; 1) = \exp(-z)$, so from (19), it follows that
$$\frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{1}{n \Delta P(x_{n_j})}} \xrightarrow{a.s.} \int_{\mathbb{R}^2} p(x) \frac{q(x)}{p(x)} \sqrt{z}\, f(z; 1)\, dx\, dz = \int_{\mathbb{R}} q(x)\, dx \int_{0}^{\infty} \sqrt{z} \exp(-z)\, dz = \frac{\sqrt{\pi}}{2},$$
since $\int_{\mathbb{R}} q(x)\, dx$ is 1 and $\int_{0}^{\infty} \sqrt{z} \exp(-z)\, dz = \Gamma(3/2) = \sqrt{\pi}/2$. Therefore, since $2\hat{A}(P, Q)/\sqrt{\pi}$ and $2\hat{A}(Q, P)/\sqrt{\pi}$ converge almost surely to $A$, from (18), it follows that $4\hat{A}/\pi$ also converges almost surely to $A$. □
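The forward and reverse affinity estimates (16) and (17), each scaled by $2/\sqrt{\pi}$ as in the first statement of Corollary 2, can be sketched as below. As before, this is an illustrative sketch and not the author's code: the function names are hypothetical, the samples are assumed to be distinct, and the choices of $\epsilon$ and of the outer knots are assumptions.

```python
import numpy as np
from scipy import stats

def continuous_ecdf(x_sorted, x0, x_np1):
    """Piecewise-linear P_c through (x_i, (i - 1/2)/n), with P_c(x0)=0 and P_c(x_np1)=1."""
    n = x_sorted.size
    knots_x = np.concatenate(([x0], x_sorted, [x_np1]))
    knots_p = np.concatenate(([0.0], (np.arange(1, n + 1) - 0.5) / n, [1.0]))
    return lambda t: np.interp(t, knots_x, knots_p)

def semi_empirical_affinity(samples, model_cdf, model_ppf, m=None):
    """Return (2/sqrt(pi)) * A_hat(P,Q) and (2/sqrt(pi)) * A_hat(Q,P)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    m = n if m is None else m

    # Deterministic model points x'_j = Q^{-1}(j / (m + 1)).
    xq = model_ppf(np.arange(1, m + 1) / (m + 1))

    # Outer knots and epsilon are assumed choices (see the lead-in).
    lo, hi = min(x[0], xq[0]), max(x[-1], xq[-1])
    pad = (hi - lo) / n
    P_c = continuous_ecdf(x, lo - pad, hi + pad)
    eps = 0.5 * np.diff(x).min()
    d_Pc = lambda t: P_c(t) - P_c(t - eps)
    d_Q = lambda t: model_cdf(t) - model_cdf(t - eps)

    forward_raw = np.mean(np.sqrt(d_Q(x) / d_Pc(x)))    # Eq. (16)
    reverse_raw = np.mean(np.sqrt(d_Pc(xq) / d_Q(xq)))  # Eq. (17)
    scale = 2.0 / np.sqrt(np.pi)                        # 1 / Gamma(3/2)
    return scale * forward_raw, scale * reverse_raw

# Illustrative use: data from N(0, 1), model N(1, 1); the true Hellinger
# affinity of two unit-variance normals with means 0 and 1 is exp(-1/8).
rng = np.random.default_rng(1)
data = rng.normal(0.0, 1.0, size=20000)
model = stats.norm(loc=1.0, scale=1.0)
a_fwd, a_rev = semi_empirical_affinity(data, model.cdf, model.ppf)
print(f"forward: {a_fwd:.3f}, reverse: {a_rev:.3f}, true: {np.exp(-0.125):.3f}")
```

The sketch deliberately returns the two directional estimates separately, so that each can be compared with the true affinity before they are combined.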

4. Numerical Examples

The following subsection focuses on evaluating the goodness of fit for some commonly used radar sea clutter models. The subsequent subsection explores additional examples unrelated to sea clutter models, highlighting instances where convergence limitations may be observed, but also presenting an instance where convergence is reached with a moderate number of samples.

4.1. Radar Sea Clutter

Within this subsection, two illustrative examples showing the performance of the proposed semi-empirical estimators are given. The presented results are compared with those derived from empirical estimators proposed in [34,35]. Both the semi-empirical KL divergences (9) and (10), along with the SH distance (18), are included. The presented examples cover a scenario where the true distribution is assumed to be known, as well as a situation where it remains unknown. Examples draw inspiration from [45], where the authors chose to fit specific datasets from a measurement campaign [36] without specifying which datasets were used. Upon examining the available files from [36], it was found that datasets 17 and 31 were utilised. In both cases, cell number 14 was tested. As a remark, selected datasets are particularly interesting as they share some properties found in COTS radar sensors and correspond to significant wave heights of 2.3 m (dataset 17) and 0.9 m (dataset 31) which represent two-thirds of the total wave heights observed in the Adriatic Sea [46]. The fitting involved K and GP distributions as models. A brief overview of these distributions is provided in Appendix A.
In the first example from [45], a heavy-tailed distribution is observed and the GP distribution provides a good fit to the data. Therefore, it is treated as the known distribution, and KL divergences and the SH distance between this distribution and the K distribution as a model are estimated. However, in the second example from [45], a distribution with a moderate tail is observed and it remains uncertain which distribution better captures the data, K or GP. Consequently, KL divergences and SH distance are estimated using actual data, with the K and GP distributions as models.
To evaluate the KL divergence and SH distance using data from the first example in [45], the model distribution P is defined as
P ( x ) = c p 1 4 ν p β p x 2 + 4 ν p , ν p , β p > 0 , 0 x x m
for which clutter amplitude intensity z = x 2 is GP distributed. Truncated distribution (20) has shape and scale parameters denoted as ν p = 1.02 and β p = 49678.19 V 2 , respectively. Constant c p depends on the maximum observed amplitude value x m as
$$c_p^{-1} = 1 - \frac{4^{\nu_p}}{\left( \beta_p x_m^2 + 4 \right)^{\nu_p}},$$
where $x_m$ is, from the results presented in [45], approximately $0.35\ \mathrm{V}$. Likewise, the model distribution $Q$ is defined as the truncated K-distribution CDF
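$$Q(x) = c_q \left[ 1 - \frac{2 x^{\nu_q}}{\sqrt{\beta_q^{\nu_q}}\, \Gamma(\nu_q)} K_{\nu_q}\!\left( \frac{2x}{\sqrt{\beta_q}} \right) \right], \quad \nu_q, \beta_q > 0, \quad 0 \le x \le x_m, \tag{21}$$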
with the shape parameter set to $\nu_q = 0.38$ and the scale parameter set to $\beta_q = 0.00109\ \mathrm{V}^2$. In (21), $K_{\nu_q}$ denotes a modified Bessel function of the second kind [47], and the constant $c_q$ also depends on $x_m$ as
$$c_q^{-1} = 1 - \frac{2 x_m^{\nu_q}}{\sqrt{\beta_q^{\nu_q}}\, \Gamma(\nu_q)} K_{\nu_q}\!\left( \frac{2 x_m}{\sqrt{\beta_q}} \right).$$
Truncating the distributions (20) and (21) prevents numerical underflow in the denominator of (9). This issue arises particularly with high-amplitude values ($x > x_m$), which possess a notable probability of occurrence due to the heavy-tailed nature of (20). In other words, when the numerator in Equation (9) follows a heavy-tailed distribution (e.g., GP distribution) and the denominator follows a moderate-tailed distribution (e.g., K distribution), large sample amplitudes from the heavy-tailed empirical data can cause the denominator to reduce to zero, resulting in numerical underflow. Truncating the heavy-tailed distribution helps prevent such occurrences.
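As a rough illustration of the truncated CDF (20) and its constant $c_p$, the following Python sketch evaluates $P(x)$ and its closed-form inverse for the parameter values quoted above. The function names are hypothetical, and the scale parameter is assumed to multiply $x^2$ with $x$ in volts, so that $\beta_p x^2$ is dimensionless. The truncated K CDF (21) is not covered here because it requires a Bessel function that the Python standard library does not provide.

```python
import math

NU_P = 1.02        # shape parameter of the GP distribution, from the text
BETA_P = 49678.19  # scale parameter; assumed to multiply x**2 (x in volts)
X_M = 0.35         # truncation point in volts, from the text


def gp_tail(x, nu=NU_P, beta=BETA_P):
    """Tail probability of the untruncated GP amplitude: (4 / (beta x^2 + 4))^nu."""
    return (4.0 / (beta * x * x + 4.0)) ** nu


def truncated_gp_cdf(x, nu=NU_P, beta=BETA_P, x_m=X_M):
    """Truncated CDF P(x) = c_p * (1 - tail(x)) on 0 <= x <= x_m."""
    if x <= 0.0:
        return 0.0
    if x >= x_m:
        return 1.0
    c_p = 1.0 / (1.0 - gp_tail(x_m, nu, beta))
    return c_p * (1.0 - gp_tail(x, nu, beta))


def truncated_gp_quantile(u, nu=NU_P, beta=BETA_P, x_m=X_M):
    """Closed-form inverse of truncated_gp_cdf for 0 < u < 1."""
    c_p = 1.0 / (1.0 - gp_tail(x_m, nu, beta))
    tail = 1.0 - u / c_p
    return math.sqrt(4.0 * (tail ** (-1.0 / nu) - 1.0) / beta)


median = truncated_gp_quantile(0.5)
```

Because the truncated CDF reaches exactly 1 at $x_m$, no amplitude beyond the truncation point can be produced by inverse-transform sampling from it.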
The numerical experiment is organised as follows. In order to reproduce the empirical estimation proposed in [34,35], two equally sized sets are generated from densities $p$ and $q$, corresponding to distributions $P$ and $Q$, respectively. Specifically, $X = \{x_i \sim p(\nu_p, \beta_p, x_m),\ i = 1, \dots, n\}$ and $X' = \{x'_j \sim q(\nu_q, \beta_q, x_m),\ j = 1, \dots, m = n\}$. Their respective empirical CDFs are then used as inputs for the empirical estimators. For the semi-empirical estimation, set $X$ is kept as is and set $X'$ is constructed with the help of the inverse transformation $X' = Q^{-1}(U)$. Here, the set $U = \{j/(m+1),\ j = 1, \dots, m\}$ is a totally ordered set as described in the proof of Corollary 1, and $Q$ is defined by (21). The generated samples $X$ and $X'$ were subjected to the KS test to determine whether they followed the truncated GP (A6) and K (A5) distributions, respectively. The KS test [48] did not reject the null hypothesis $H_0$ that the samples are drawn from the truncated GP and K distributions, indicating finite KS distances. Additionally, the Ljung–Box test for independence [49] was performed, and the null hypothesis $H_0$ that the samples are independently distributed was likewise not rejected. The Monte Carlo method for simulating the random samples is described in greater detail in Appendix B and Table A1.
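The construction of the model-side set through the inverse transformation $Q^{-1}(U)$ can be sketched as follows. This is a minimal illustration rather than the procedure used in the experiments: the helper names are hypothetical, the CDF is inverted numerically by bisection, and a unit-mean exponential CDF stands in for the truncated K CDF (21), which would need a Bessel-function routine.

```python
import math


def uniform_grid(m):
    """Totally ordered set U = {j / (m + 1), j = 1, ..., m}."""
    return [j / (m + 1) for j in range(1, m + 1)]


def invert_cdf(cdf, u, lo, hi, iterations=200):
    """Solve cdf(x) = u on [lo, hi] by bisection; cdf must be non-decreasing."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def model_set(cdf, m, lo, hi):
    """Deterministic model-side set obtained as Q^{-1}(U)."""
    return [invert_cdf(cdf, u, lo, hi) for u in uniform_grid(m)]


def exp_cdf(x):
    """Stand-in model CDF: unit-mean exponential (an assumption of this sketch)."""
    return 1.0 - math.exp(-x)


x_model = model_set(exp_cdf, m=9, lo=0.0, hi=50.0)
```

Since the grid $U$ is fixed, the resulting set contains no randomness, which is the reason the only random input to the semi-empirical estimator is the empirical dataset.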
The comparison between the KL divergence estimates across 100 trials using the empirical method from [34] and the semi-empirical methods (9) and (10) is illustrated in Figure 1. Although the semi-empirical KL divergence estimation demonstrates lower MSE, as highlighted in Figure 2, it exhibits a slower convergence to the true value for the specific distributions considered in this example. The MSE is defined for forward estimation as
$$\mathrm{MSE}\left[ \hat{D}(P\|Q) \right] = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{D}_i(P\|Q) - D(P\|Q) \right)^2$$
and for reverse estimation as
$$\mathrm{MSE}\left[ \hat{D}(Q\|P) \right] = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{D}_i(Q\|P) - D(Q\|P) \right)^2$$
where the subscript i represents the i-th realisation.
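The MSE definitions above amount to averaging squared deviations of the per-realisation estimates from the true divergence. A minimal sketch, with a hypothetical function name and made-up illustrative numbers rather than results from the experiments, is:

```python
def mse(estimates, true_value):
    """Mean squared error of per-realisation estimates around a known true value."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    return sum((d_hat - true_value) ** 2 for d_hat in estimates) / len(estimates)


# Hypothetical per-realisation estimates, for illustration only.
trial_estimates = [0.10, 0.12, 0.08, 0.11]
example_mse = mse(trial_estimates, true_value=0.10)
```

The same function applies unchanged to the forward divergence, the reverse divergence, and the SH distance, since only the list of estimates and the reference value differ.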
The findings of the SH distance estimation, utilising both the empirical method from [35] and the semi-empirical method (18) across 100 trials, are presented in Figure 3a. The MSE corresponding to these estimations is presented in Figure 3b, and, as expected, the MSE of the semi-empirical estimate is lower. Notably, for the distributions examined in this example, the semi-empirical estimation exhibits a faster convergence towards the true value. The MSE for SH distance is defined as
$$\mathrm{MSE}\left[ \widehat{H^2} \right] = \frac{1}{n} \sum_{i=1}^{n} \left( \widehat{H^2}_i - H^2 \right)^2$$
where subscript i stands for the i-th realisation.
For the second example from [45], amplitude measurements in dataset 31 are permuted to meet the condition of independent samples required by Theorem 1. Although this does not guarantee sample independence, independence will be assumed for the purposes of this example. Furthermore, amplitude in the utilised dataset was quantised in 256 levels, which makes the quantisation effect significant and, as a consequence, the condition of continuous distribution required by Theorem 1 is not met. Therefore, band-limited noise that follows the density
$$w(x) = \frac{1}{c_w} \left( \frac{\sin \frac{\pi x}{4 \Delta x}}{\frac{\pi x}{4 \Delta x}} \right)^{4}, \quad c_w = \int_{-\infty}^{\infty} \left( \frac{\sin \frac{\pi x}{4 \Delta x}}{\frac{\pi x}{4 \Delta x}} \right)^{4} dx$$
(where $\Delta x$ is the quantisation interval) was added to the amplitude samples to reconstruct the original continuous signal data [50]. Since the original data before sampling were likely not dithered, this introduces some distortion of the original density. However, visual inspection of the histogram before and after noise addition shows no significant deviation. Consequently, for the purpose of this example, the distortion is considered negligible.
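The normalising constant of the noise density can be checked numerically. With the substitution $t = \pi x / (4 \Delta x)$ and the standard integral $\int_{-\infty}^{\infty} (\sin t / t)^4 \, dt = 2\pi/3$, the constant reduces to $c_w = 8 \Delta x / 3$. The sketch below compares this closed form with a midpoint-rule estimate; the function names and the value of $\Delta x$ are assumptions made purely for illustration.

```python
import math


def sinc4(t):
    """(sin t / t)^4, with the limiting value 1 at t = 0."""
    if abs(t) < 1e-8:
        return 1.0
    return (math.sin(t) / t) ** 4


def normalising_constant(delta_x, t_max=200.0, steps=200_000):
    """Midpoint-rule estimate of c_w, integrating over t = pi x / (4 delta_x)."""
    a = math.pi / (4.0 * delta_x)     # scale factor between x and t
    h = 2.0 * t_max / steps           # step size on the t axis
    total = sum(sinc4(-t_max + (k + 0.5) * h) for k in range(steps))
    return total * h / a              # dx = dt / a


DELTA_X = 1.0 / 256.0                 # hypothetical quantisation interval
c_w_numeric = normalising_constant(DELTA_X)
c_w_closed = 8.0 * DELTA_X / 3.0
```

The integrand decays as $t^{-4}$, so truncating the integration range at a few hundred units of $t$ leaves a negligible remainder.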
Considering the aforementioned points, the numerical example is organised as follows. Dataset 31 from [36] is divided into $k$ groups, each containing $n$ samples, such that the product $kn$ equals 131,072, which is the total number of amplitude samples in the dataset. Each group is treated as a separate trial. The amplitude distribution of the samples is fitted using the GP distribution and the K distribution, with the parameters taken from [45]. For the GP distribution, the parameters are $\nu_p = 1.53$ and $\beta_p = 1067.66\ \mathrm{V}^{-2}$, while for the K distribution, the parameters are $\nu_q = 1$ and $\beta_q = 0.00620\ \mathrm{V}^2$. The real data involved in this example do not exhibit the high-level excursions observed in the simulated data. Therefore, the truncation point of the distributions (20) and (21) is set to the maximum observed amplitude, $x_m = 0.7\ \mathrm{V}$.
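The permutation and grouping step can be sketched as below. The function name, the seed, and the group size are hypothetical, and a list of integers stands in for the measured amplitudes, which are not reproduced here.

```python
import random


def permute_and_split(samples, n, seed=0):
    """Shuffle the samples, then cut them into consecutive groups of n."""
    if len(samples) % n != 0:
        raise ValueError("n must divide the number of samples")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + n] for i in range(0, len(shuffled), n)]


TOTAL = 131_072                     # number of amplitude samples quoted in the text
placeholder = list(range(TOTAL))    # stand-in for the measured amplitudes
groups = permute_and_split(placeholder, n=1024)   # k = 128 trials of n = 1024
```

Because 131,072 is a power of two, any power-of-two group size up to the full length divides it exactly, so no samples are discarded.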
Figure 4 presents the results of both empirical and semi-empirical estimations of forward and reverse KL divergence for the GP distribution. Similarly, Figure 5 displays these results for the K distribution. The convergence value meets expectations, as the KL divergence between these models is 0.032 for the forward divergence and 0.03 for the reverse divergence. Given that the data in the tail region lie between values of these two models, it is reasonable to expect KL divergences to be lower than the mentioned values. Furthermore, as shown in Figure 6, although the empirical estimation converges faster, it exhibits greater estimation variance compared to the semi-empirical estimation. This observation holds true for both the GP and the K-distribution models. The difference in estimation variance between the GP and K-distribution models is not significant.
Regarding the SH distance estimate, Figure 7 demonstrates faster convergence, and Figure 8 a smaller variance, when the semi-empirical method is used. Given that the theoretical SH distance between models (20) and (21) is 0.0075, the observed convergence value aligns with expectations. Since the data in the tail region fall within the tails of the distributions used, it is reasonable to expect an SH distance smaller than this value.

4.2. Additional Numerical Examples

This subsection provides additional numerical examples illustrating the semi-empirical estimation of KL divergence and SH distance, demonstrating that the proposed method works well for distributions not necessarily related to sea clutter. These examples are inspired by the numerical experiments in [34,35]. In the first part of these numerical experiments, KL divergence and SH distance are evaluated for distributions with different supports. Specifically, the KL divergence is computed between samples from a unit-mean exponential distribution, $X = \{x_i : x_i \sim \mathrm{Exp}(1),\ i = 1, \dots, n\}$, with support $x \in [0, \infty)$, and a model represented by a normal distribution $\mathcal{N}(3, 4)$ with support $x \in \mathbb{R}$. Forward estimation results for KL divergence are depicted in Figure 9a, while reverse estimation results are shown in Figure 9b. Notably, reverse estimation is significantly slower than forward estimation in this case, with higher MSE. A similar trend is observed for sea clutter distributions in Figure 1, where the set $X$ is sampled from a heavy-tailed GP distribution, and the model distribution is a moderately tailed K distribution. The MSE for both forward and reverse estimations is presented in Figure 10.
Unlike KL divergence estimation, SH distance estimation for distributions with non-equal support demonstrates greater robustness. This is illustrated in Figure 11a, where the samples are drawn from the normal distribution, $X = \{x_i : x_i \sim \mathcal{N}(3, 4)\}$, and the model distribution is a unit-mean exponential. A similar pattern is observed for SH distance estimation between samples drawn from a heavy-tailed GP distribution and a model given by a K distribution, as shown in Figure 3a. The MSE for SH distance estimation is shown in Figure 11b.
The remainder of this subsection focuses on comparing divergences and distances between normal distributions with different parameters. Figure 12 depicts the forward and reverse KL divergence between samples drawn from zero-mean normal distributions, where the empirical distribution has a variance of 1 and the model distribution has a variance of 2. The corresponding MSE is shown in Figure 13. Notice the significantly slower convergence in the reverse estimation, leading to a higher MSE. For comparison, the SH distance between the same datasets is presented in Figure 14a, with the associated MSE provided in Figure 14b.
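For normal distributions the true KL divergence is available in closed form, which gives a reference value for the estimates. The sketch below uses the standard expression for two univariate normals, in nats, and assumes that the forward direction is $D(P\|Q)$ with $P$ the distribution of the samples and $Q$ the model, as in the MSE definitions; the function name is hypothetical.

```python
import math


def kl_normal(mu_p, var_p, mu_q, var_q):
    """D(P||Q) in nats for P = N(mu_p, var_p) and Q = N(mu_q, var_q)."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


forward = kl_normal(0.0, 1.0, 0.0, 2.0)   # samples N(0, 1), model N(0, 2)
reverse = kl_normal(0.0, 2.0, 0.0, 1.0)   # model N(0, 2), samples N(0, 1)
```

Under these assumptions the forward value is about 0.0966 nats and the reverse value about 0.1534 nats, so the two directions differ even for this simple pair.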
Next, SH distances and their corresponding MSEs are compared for samples drawn from the normal distribution $\mathcal{N}(0, 1)$ against model distributions $\mathcal{N}(1, 1)$ and $\mathcal{N}(2, 1)$. The results are shown in Figure 15 and Figure 16, respectively.
Finally, the SH distance is estimated for samples drawn from the normal distribution $\mathcal{N}(0, 4)$ and a model following the normal distribution $\mathcal{N}(1, 1)$. The results are presented in Figure 17a for the estimation and Figure 17b for the MSE.
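A closed-form reference value also exists for the SH distance between two normal distributions. The sketch below assumes the convention $H^2 = 1 - \int \sqrt{p(x) q(x)}\, dx$; other conventions differ by a constant factor, so the numbers should be rescaled if definition (18) uses a different normalisation. The second parameter of each distribution is taken to be the variance, and the function name is hypothetical.

```python
import math


def sh_normal(mu_p, var_p, mu_q, var_q):
    """Squared Hellinger distance 1 - BC between two univariate normals."""
    s = var_p + var_q
    # Bhattacharyya coefficient of two normal densities
    bc = (math.sqrt(2.0 * math.sqrt(var_p * var_q) / s)
          * math.exp(-((mu_p - mu_q) ** 2) / (4.0 * s)))
    return 1.0 - bc


sh_shift_1 = sh_normal(0.0, 1.0, 1.0, 1.0)   # N(0, 1) against N(1, 1)
sh_shift_2 = sh_normal(0.0, 1.0, 2.0, 1.0)   # N(0, 1) against N(2, 1)
sh_mixed = sh_normal(0.0, 4.0, 1.0, 1.0)     # N(0, 4) against N(1, 1)
```

Unlike the KL divergence, this quantity is symmetric in its two arguments and bounded, which is consistent with the more robust behaviour reported for the SH estimates.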

5. Conclusions

The suggested approach for evaluating the goodness of fit of model distributions to clutter data is part of a broader framework focused on replicating sea clutter data obtained from measurements. Such a replication is valuable for evaluating the performance of various processes, such as target-tracking algorithms, or other types of processing aimed to mitigate the effects of sea clutter returns. The results demonstrated in the examples for the representative K and GP distributions indicate that a sample size greater than 1000 is sufficient for the semi-empirical estimate to converge to the true value. Additional examples demonstrate that the proposed method is not limited to compound distributions like K and GP. In most cases, the method performs effectively, and compared to its empirical counterpart, it achieves a lower MSE because randomness originates solely from a single source, the empirical dataset. However, in certain scenarios, particularly when the supports of the sample and model distributions differ, reverse KL divergence estimation may require an extremely large sample size, exceeding 100,000, to converge. In such cases, the MSE is significantly higher compared to forward estimation.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the study are openly available in the McMaster IPIX radar sea clutter database at http://soma.ece.mcmaster.ca/ipix/dartmouth/datasets.html (accessed on 27 September 2023).

Acknowledgments

I would like to express my gratitude to Abertis Mobility Services for their generous support in covering the journal publication fees for this article.

Conflicts of Interest

Author Bojan Vondra was employed by the company emovis tehnologije d.o.o., Split, Croatia and declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CDF: Cumulative Distribution Function
CGIG: Compound Gaussian Inverse Gaussian
COTS: Commercial Off-The-Shelf
GP: Generalised Pareto
i.i.d.: independent and identically distributed
IPIX: Intelligent PIxel processing X-band
KL: Kullback-Leibler
KS: Kolmogorov-Smirnov
MSE: Mean Square Error
PDF: Probability Distribution Function
RIB: Rigid Inflatable Boat
RiIG: Ricean Inverse Gaussian
SH: Squared Hellinger

Appendix A

Both the K and GP distributions belong to the class of complex elliptically symmetric distributions [7] and can be observed in the context of the surface wave model of the sea, which is primarily composed of two components: large gravitational waves and smaller capillary waves, as described in [51]. Consequently, the complex envelope of clutter returns can be written as $d = x\sqrt{\zeta}$ [52]. Here, $x$ denotes the speckle component, which results from reflections from a large number of uniformly distributed reflectors within the sensor’s resolution cell and is a complex Gaussian process with a temporal correlation of several milliseconds. The component $\zeta$ arises from reflections of electromagnetic energy from large gravitational (wind-driven and swell) waves and describes the background power, with a temporal correlation of several seconds. If the variances of the in-phase and quadrature-phase components of $x$ are normalised to 1 as in [45] and $\zeta$ is gamma distributed with shape and scale parameters $\nu_q$ and $\beta_q$, defined as
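$$p(\zeta) = \frac{\zeta^{\nu_q - 1}}{\left( \frac{\beta_q}{2} \right)^{\nu_q} \Gamma(\nu_q)} \exp\left( -\frac{2\zeta}{\beta_q} \right), \quad \zeta, \nu_q, \beta_q > 0, \tag{A1}$$
then the amplitude distribution $q(x = |d|)$ follows a K distribution
$$q(x = |d|) = \frac{4 x^{\nu_q}}{\beta_q^{\frac{\nu_q + 1}{2}} \Gamma(\nu_q)} K_{\nu_q - 1}\!\left( \frac{2x}{\sqrt{\beta_q}} \right), \quad x \ge 0, \tag{A2}$$
where $\Gamma$ is the gamma function and $K$ denotes the modified Bessel function of the second kind [47]. Moreover, if $\zeta$ is distributed as inverse gamma with shape and scale parameters $\nu_p$ and $\beta_p$, given by
$$p(\zeta) = \frac{\zeta^{-\nu_p - 1}}{\left( \frac{\beta_p}{2} \right)^{\nu_p} \Gamma(\nu_p)} \exp\left( -\frac{2}{\beta_p \zeta} \right), \quad \zeta, \nu_p, \beta_p > 0, \tag{A3}$$
the amplitude distribution $p(x = |d|)$ is given by
$$p(x = |d|) = \frac{x \beta_p \Gamma(\nu_p + 1)}{2 \left( \frac{\beta_p}{4} x^2 + 1 \right)^{\nu_p + 1} \Gamma(\nu_p)}, \quad x \ge 0, \tag{A4}$$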
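which corresponds to a GP distribution for the amplitude intensity $z = x^2$. Truncated versions of (A2) and (A4) are given as
$$q(x = |d|) = \left[ 1 - \frac{2 x_m^{\nu_q}}{\sqrt{\beta_q^{\nu_q}}\, \Gamma(\nu_q)} K_{\nu_q}\!\left( \frac{2 x_m}{\sqrt{\beta_q}} \right) \right]^{-1} \frac{4 x^{\nu_q}}{\beta_q^{\frac{\nu_q + 1}{2}} \Gamma(\nu_q)} K_{\nu_q - 1}\!\left( \frac{2x}{\sqrt{\beta_q}} \right), \quad 0 \le x \le x_m, \tag{A5}$$
for the K distribution and as
$$p(x = |d|) = \left[ 1 - \frac{4^{\nu_p}}{\left( \beta_p x_m^2 + 4 \right)^{\nu_p}} \right]^{-1} \frac{x \beta_p \Gamma(\nu_p + 1)}{2 \left( \frac{\beta_p}{4} x^2 + 1 \right)^{\nu_p + 1} \Gamma(\nu_p)}, \quad 0 \le x \le x_m, \tag{A6}$$
for the GP distribution.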

Appendix B

Samples from the truncated K (A5) and GP (A6) distributions, with shape and scale parameters $(\nu_q, \beta_q)$ and $(\nu_p, \beta_p)$, respectively, and with maximum value $x_m$, are generated based on the compound nature of these densities. Specifically, the speckle $x$ is normalised complex Gaussian noise. If the background (or texture) $\zeta$ follows an inverse gamma distribution (A3), the amplitude of the compound variable, $|d| = |x|\sqrt{\zeta}$, follows a GP distribution. If the texture follows a gamma distribution (A1), $|d|$ follows a K distribution. Leveraging this approach, a naive accept–reject method [53] for generating samples from truncated K and GP distributions is outlined in Table A1. The random number generators for gamma and normal distributions, among others, are provided as built-in functions in the Scilab computation engine [54], which is utilised throughout the section on numerical examples.
Table A1. Algorithm for generating n samples from truncated K and GP distributions.
| Step | Description |
|---|---|
| 1 | $i = 1$ |
| 2 | If distribution is GP, then generate sample $\zeta_i$ from gamma distribution (A1) with shape and scale parameters $\nu_p$, $\beta_p$, else, generate sample $\zeta_i$ from gamma distribution with shape and scale parameters $\nu_q$, $\beta_q$ |
| 3 | If distribution is GP, then $\zeta_i = 1/\zeta_i$, else, continue |
| 4 | Generate in-phase sample $x_{Ii}$ from zero-mean Gaussian distribution with variance $\zeta_i$, i.e., $x_{Ii} \sim \mathcal{N}(0, \zeta_i)$ |
| 5 | Generate quadrature-phase sample $x_{Qi}$ from zero-mean Gaussian distribution with variance $\zeta_i$, i.e., $x_{Qi} \sim \mathcal{N}(0, \zeta_i)$ |
| 6 | Amplitude of compound distribution is $d_i = |x_{Ii} + \mathrm{i}\, x_{Qi}|$ |
| 7 | If $d_i \le x_m$, then $i = i + 1$, else, go to Step 2 |
| 8 | If $i \le n$, go to Step 2, else, stop |
This approach, although functional, is relatively inefficient. For both the K and GP distributions, generating 100,000 samples that follow truncated K or GP distributions requires approximately seven times more computational effort compared to the non-truncated case.
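The steps of Table A1 can be sketched in Python as follows. This is an illustrative translation, not the Scilab code used for the experiments: the function name is hypothetical, the gamma generator is called with scale $\beta/2$ so that it matches the parametrisation of (A1) and (A3), and the GP scale parameter is assumed to be expressed in V$^{-2}$ with amplitudes in volts.

```python
import math
import random


def sample_truncated(kind, nu, beta, x_m, n, rng):
    """Accept-reject sampler following the steps of Table A1.

    kind: "K" (gamma texture) or "GP" (inverse-gamma texture).
    beta follows the parametrisation of (A1)/(A3): the gamma scale is beta / 2.
    """
    samples = []
    while len(samples) < n:
        zeta = rng.gammavariate(nu, beta / 2.0)   # Step 2: gamma texture
        if kind == "GP":
            if zeta == 0.0:
                continue                          # guard against division by zero
            zeta = 1.0 / zeta                     # Step 3: inverse-gamma texture
        sigma = math.sqrt(zeta)
        x_i = rng.gauss(0.0, sigma)               # Step 4: in-phase component
        x_q = rng.gauss(0.0, sigma)               # Step 5: quadrature component
        d = math.hypot(x_i, x_q)                  # Step 6: amplitude
        if d <= x_m:                              # Step 7: accept or redraw
            samples.append(d)
    return samples                                # Step 8: stop after n samples


rng = random.Random(1)   # fixed seed so that the sketch is repeatable
gp_samples = sample_truncated("GP", nu=1.02, beta=49678.19, x_m=0.35, n=5000, rng=rng)
k_samples = sample_truncated("K", nu=0.38, beta=0.00109, x_m=0.35, n=5000, rng=rng)
```

Every rejected draw discards a texture value together with both Gaussian components, which is the source of the extra cost of truncation; inverting the truncated CDF directly would avoid rejections where a closed-form inverse exists.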

References

  1. Vondra, B.; Bonefačić, D.; Mišković, T. Employment of Semi-Parametric Radar Sea Clutter Model in Civilian Surveillance of Croatian Part of Adriatic Sea. In Annual of the Croatian Academy of Engineering; Andročec, V., Mrša, V., Rogale, D., Eds.; Akademija Tehničkih Znanosti Hrvatske (HATZ): Kačićeva, Zagreb, 2021; Volume 2020, pp. 42–56.
  2. Katalinić, M.; Ćorak, M.; Parunov, J. Analysis of Wave Heights and Wind Speeds in the Adriatic Sea. In Maritime Technology and Engineering; Chapter Analysis of Wave Heights and Wind Speeds in the Adriatic Sea; Soares, C.G., Santos, T.A., Eds.; Taylor & Francis Group: London, UK, 2015; pp. 1389–1394.
  3. Dong, Y. Distribution of X-Band High Resolution and High Grazing Angle Sea Clutter; Technical Report; Defence Science and Technology Organisation: Edinburgh, Australia, 2006.
  4. Rosenberg, L.; Crisp, D.J.; Stacy, N.J. Analysis of the KK-distribution with X-band Medium Grazing Angle Sea-Clutter. IET Radar Sonar Navig. 2010, 4, 209–222.
  5. Nohara, T.J.; Haykin, S. Canadian East Coast Radar Trials and the K-distribution. IEE Proc. F-Radar Signal Process. 1991, 138, 80–88.
  6. Sangston, K.J.; Gini, F.; Greco, M.S. Coherent Radar Target Detection in Heavy-Tailed Compound-Gaussian Clutter. IEEE Trans. Aerosp. Electron. Syst. 2012, 48, 64–77.
  7. Ollila, E.; Tyler, D.E.; Koivunen, V.; Poor, H.V. Complex Elliptically Symmetric Distributions: Survey, New Results and Applications. IEEE Trans. Signal Process. 2012, 60, 5597–5625.
  8. Johnsen, T. Characterization of X-band Radar Sea-Clutter in a Limited Fetch Condition from Low to High Grazing Angles. In Proceedings of the IEEE Radar Conference, Johannesburg, South Africa, 27–30 October 2015; pp. 109–114.
  9. Johnsen, T.; Ødegaard, N.; Knapskog, A.O. X-Band Radar Sea-Clutter Measurements from Low to Medium Grazing Angles Recorded from a Helicopter Platform; Meeting Proceedings STO-MP-SET-239; Science and Technology Organisation: Neuilly-sur-Seine, France, 2016.
  10. Song, C.; Xiuwen, L. Statistical Analysis of X-band Sea Clutter at Low Grazing Angles. In Proceedings of the 2020 International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE), Bangkok, Thailand, 30 October–1 November 2020; pp. 141–144.
  11. Bocquet, S. Analysis and Simulation of Low Grazing Angle X-Band Coherent Radar Sea Clutter Using Memoryless Nonlinear Transformations. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13.
  12. Bouvier, C.; Martinet, L.; Favier, G.; Artaud, M. Simulation of Radar Sea Clutter Using Autoregressive Modelling and K-Distribution. In Proceedings of the IEEE International Radar Conference, Alexandria, VA, USA, 8–11 May 1995; pp. 425–430.
  13. Chan, H.C. Radar Sea-Clutter at Low Grazing Angles. IEE Proc. F-Radar Signal Process. 1990, 137, 102–112.
  14. Crisp, D.J.; Rosenberg, L.; Stacy, N.J.; Dong, Y. Modelling X-band Sea Clutter with the K-distribution: Shape Parameter Variation. In Proceedings of the 2009 International Radar Conference “Surveillance for a Safer World”, Bordeaux, France, 12–26 October 2009; pp. 1–6.
  15. Ollila, E.; Tyler, D.E.; Koivunen, V.; Poor, H.V. Compound-Gaussian Clutter Modeling with an Inverse Gaussian Texture Distribution. IEEE Signal Process. Lett. 2012, 19, 876–879.
  16. Mandal, S.K.; Bhattacharya, C. Validation of Stochastic Properties of High Resolution Clutter Data from IPIX Radar Data. In Proceedings of the 2013 International Conference on Intelligent Systems and Signal Processing (ISSP), Vallabh Vidyanagar, India, 1–2 March 2013; pp. 251–255.
  17. Mezache, A.; Soltani, F.; Sahed, M.; Chalabi, I. Model for Non-Rayleigh Clutter Amplitudes Using Compound Inverse Gaussian Distribution: An Experimental Analysis. IEEE Trans. Aerosp. Electron. Syst. 2015, 51, 142–153.
  18. Eltoft, T. The Rician Inverse Gaussian Distribution: A New Model for Non-Rayleigh Signal Amplitude Statistics. IEEE Trans. Image Process. 2005, 14, 1722–1735.
  19. Strempel, M.D.; Villiers, J.P.D.; Cilliers, J.E.; McDonald, A. Distribution Analysis of Segmented Wave Sea Clutter in Littoral Environments. In Proceedings of the 2015 IEEE Radar Conference, Johannesburg, South Africa, 27–30 October 2015; pp. 133–138.
  20. Angelliaume, S.; Rosenberg, L.; Ritchie, M. Modeling the Amplitude Distribution of Radar Sea Clutter. Remote Sens. 2019, 11, 319.
  21. Gierull, C.H.; Sikaneta, I. A Compound-plus-Noise Model for Improved Vessel Detection in Non-Gaussian SAR Imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1444–1453.
  22. Cao, C.; Zhang, J.; Zhang, X.; Gao, G.; Zhang, Y.; Meng, J.; Liu, G.; Zhang, Z.; Han, Q.; Jia, Y.; et al. Modeling and Parameter Representation of Sea Clutter Amplitude at Different Grazing Angles. IEEE J. Miniaturiz. Air Space Syst. 2022, 3, 284–293.
  23. Xia, X.Y.; Shui, P.L.; Zhang, Y.S.; Li, X.; Xu, X.Y. An Empirical Model of Shape Parameter of Sea Clutter Based on X-Band Island-Based Radar Database. IEEE Geosci. Remote Sens. Lett. 2023, 20, 3503205.
  24. Jakeman, E.; Pusey, P.N. A Model for Non-Rayleigh Sea Echo. IEEE Trans. Antennas Propag. 1976, 24, 806–814.
  25. Middleton, D. New Physical-Statistical Methods and Models for Clutter and Reverberation: The KA-Distribution and Related Probability Structures. IEEE J. Ocean. Eng. 1999, 24, 261–284.
  26. Farshchian, M.; Posner, F.L. The Pareto Distribution for Low Grazing Angle and High Resolution X-band Sea Clutter. In Proceedings of the 2010 IEEE Radar Conference, Arlington, VA, USA, 10–14 May 2010; pp. 789–793.
  27. Papoulis, A. Probability, Random Variables and Stochastic Processes; McGraw-Hill, Inc.: New York, NY, USA, 1991.
  28. Rosenberg, L.; Watts, S.; Bocquet, S. Application of the K + Rayleigh Distribution to High Grazing Angle Sea-Clutter. In Proceedings of the 2014 International Radar Conference, Lille, France, 13–17 October 2014; pp. 1–6.
  29. Rosenberg, L.; Watts, S.; Greco, M.S. Modeling the Statistics of Microwave Radar Sea Clutter. IEEE Aerosp. Electron. Syst. Mag. 2019, 34, 44–75.
  30. Gao, G. Statistical Modeling of SAR Images: A Survey. Sensors 2010, 10, 775–795.
  31. Pierce, R.D. RCS Characterization Using the Alpha-Stable Distribution. In Proceedings of the 1996 IEEE National Radar Conference, Ann Arbor, MI, USA, 13–16 May 1996; pp. 154–159.
  32. Hallin, M.; Mordant, G.; Segers, J. Multivariate Goodness-of-Fit Tests Based on Wasserstein Distance. Electron. J. Stat. 2021, 15, 1328–1371.
  33. del Barrio, E.; Sanz, A.G.; Loubes, J.M. Central Limit Theorems for Semi-Discrete Wasserstein Distances. Bernoulli 2024, 30, 554–580.
  34. Perez-Cruz, F. Kullback-Leibler Divergence Estimation of Continuous Distributions. In Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, ON, Canada, 6–11 July 2008; pp. 1666–1670.
  35. Ding, R.; Mullhaupt, A. Empirical Squared Hellinger Distance Estimator and Generalizations to a Family of α-Divergence Estimators. Entropy 2023, 25, 612.
  36. Bakker, R.; Currie, B. The Dartmouth Database. Available online: http://soma.ece.mcmaster.ca/ipix/dartmouth/index.html (accessed on 27 September 2023).
  37. Laplace Transforms. In Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Ninth Dover Printing, Tenth GPO Printing ed.; Abramowitz, M.; Stegun, I.A. (Eds.) Dover: New York, NY, USA, 1964; pp. 1020–1030.
  38. Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
  39. Jeffreys, H. Theory of Probability; International Series of Monographs on Physics, Clarendon Press: Oxford, UK, 1983.
  40. Cichocki, A.; Amari, S.I. Families of Alpha- Beta- and Gamma- Divergences: Flexible and Robust Measures of Similarities. Entropy 2010, 12, 1532–1568.
  41. Bhattacharyya, A.K. On a Measure of Divergence between Two Statistical Populations Defined by Their Probability Distributions. Calcutta Math. Soc. 1943, 35, 99–109.
  42. Kailath, T. The Divergence and Bhattacharyya Distance Measures in Signal Selection. IEEE Trans. Commun. 1967, 15, 52–60.
  43. Gradstheyn, I.; Ryzhik, I.M.; Jeffrey, A.; Zwillinger, D.; Gradštejn, I. Table of Integrals, Series and Products, 7th ed.; Elsevier Academic Press: Amsterdam, The Netherlands, 2009.
  44. Gentle, J. Computational Statistics, 1st ed.; Springer: New York, NY, USA, 2009.
  45. Vondra, B.; Bonefačić, D. Mitigation of the Effects of Unknown Sea Clutter Statistics by Using Radial Basis Function Network. Radioengineering 2020, 29, 215–227.
  46. Katalinić, M.; Parunov, J. Review of Climatic Conditions in the Adriatic Sea. In Proceedings of the 21st Symposium on Theory and Practice of Shipbuilding, In Memoriam Prof. Leopold Sorta (Sorta 2014), Krk, Hrvatska, 2–4 October 2014; pp. 555–562.
  47. Glasser, L.; Kohl, K.; Koutschan, C.; Moll, V.H.; Straub, A. The Integrals in Gradshteyn and Ryzhik. Part 22: Bessel-K functions. Sci. Ser. A Math. Sci. 2012, 22, 129–151.
  48. Conover, W.J. Practical Nonparametric Statistics, 3rd ed.; Number Applied Probability and Statistics Section in Wiley Series on Probability and Statistics; John Wiley & Sons, Inc.: New York, NY, USA, 1999.
  49. Ljung, G.; Box, G. On a Measure of Lack of Fit in Time Series Models. Biometrika 1978, 65, 297–303.
  50. Gustafsson, F.; Karlsson, R. Generating Dithering Noise for Maximum Likelihood Estimation from Quantized Data. Automatica 2013, 49, 554–560.
  51. Haykin, S.; Bakker, R.; Currie, B.W. Uncovering Nonlinear Dynamics—The Case Study of Sea Clutter. Proc. IEEE 2002, 90, 860–881.
  52. Gini, F.; Greco, M. Texture Modelling, Estimation and Validation Using Measured Sea Clutter Data. IEE Proc.-Radar Sonar Navig. 2002, 149, 115–124.
  53. Robert, C.P.; Casella, G. Monte Carlo Statistical Methods; Springer: New York, NY, USA, 2004.
  54. Ramchandran, H.; Nair, A.S. SCILAB (a Free Software to MATLAB); S. Chand & Company Ltd.: Ram Nagar, New Delhi, India, 2012.
Figure 1. Comparison of empirical and semi-empirical estimates of KL divergence. (a) Forward. (b) Reverse.
Figure 2. Comparison of MSE of empirical and semi-empirical estimates of KL divergence. (a) Forward. (b) Reverse.
Figure 3. Comparison of empirical and semi-empirical estimates. (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 4. Comparison of empirical and semi-empirical estimates of KL divergence using GP distribution as model and real sea clutter data. (a) Forward. (b) Reverse.
Figure 5. Comparison of empirical and semi-empirical estimates of KL divergence using K distribution as model and real sea clutter data. (a) Forward. (b) Reverse.
Figure 6. Comparison of variances of empirical and semi-empirical estimates of KL divergence using GP and K distribution as models and real sea clutter data. (a) Forward. (b) Reverse.
Figure 7. Comparison of empirical and semi-empirical estimates of SH distance using GP and K distribution as models and real sea clutter data. (a) K distribution. (b) GP distribution.
Figure 8. Comparison of variances of empirical and semi-empirical estimates of SH distance using GP and K distributions as models and real sea clutter data.
Figure 9. Semi-empirical estimation of KL divergence between an empirical dataset following a unit-mean exponential distribution, Exp ( 1 ) , and a model distribution following a normal distribution, N ( 3 , 4 ) . (a) Forward estimation. (b) Reverse estimation.
Figure 10. MSE of the KL divergence estimation between an empirical dataset following a unit-mean exponential distribution, Exp ( 1 ) , and a model distribution following a normal distribution, N ( 3 , 4 ) . (a) Forward. (b) Reverse.
Figure 11. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution N ( 3 , 4 ) and exponential model distribution Exp ( 1 ) . (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 12. Semi-empirical estimation of the KL divergence between two normal distributions, with the empirical dataset following N ( 0 , 1 ) and the model distribution following N ( 0 , 2 ) . (a) Forward estimation. (b) Reverse estimation.
Figure 13. MSE of KL divergence estimation between two normal distributions, empirical dataset following N(0, 1) and model distribution following N(0, 2). (a) Forward. (b) Reverse.
Figure 14. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution N(0, 1) and normal model distribution N(0, 2). (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 15. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution N(0, 1) and normal model distribution N(1, 1). (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 16. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution N(0, 1) and normal model distribution N(2, 1). (a) SH distance estimation. (b) MSE of SH distance estimation.
Figure 17. Semi-empirical estimation of SH distance between empirical dataset of samples from normal distribution N(0, 4) and normal model distribution N(1, 1). (a) SH distance estimation. (b) MSE of SH distance estimation.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Vondra, B. Semi-Empirical Approach to Evaluating Model Fit for Sea Clutter Returns: Focusing on Future Measurements in the Adriatic Sea. Entropy 2024, 26, 1069. https://doi.org/10.3390/e26121069

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
