Open Access Article

Hybrid GRU–Random Forest Model for Accurate Atmospheric Duct Detection with Incomplete Sounding Data

Yi Yan, Linjing Guo ¹,*, Jiangting Li, Zhouxiang Yu, Shuji Sun ², Tong Xu ², Haisheng Zhao ² and Lixin Guo

¹ School of Physics, Xidian University, Xi’an 710071, China
² China Institute of Radio Wave Propagation, Qingdao 266107, China
* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(22), 4308; https://doi.org/10.3390/rs16224308

Submission received: 21 August 2024 / Revised: 3 November 2024 / Accepted: 14 November 2024 / Published: 19 November 2024

(This article belongs to the Special Issue Artificial Intelligence and Big Data for Oceanography)


Figure 1. The unit structure of RNN.
Figure 2. The unit structure of GRU.
Figure 3. Random Forest Schematic. In this figure, the unit of temperature T is Celsius, the unit of pressure press is hPa, the unit of relative humidity rh is %, and the unit of wind speed V is m/s.
Figure 4. Tropospheric electromagnetic wave refraction.
Figure 5. Comparison of vertical resolution in two data sets.
Figure 6. Deep learning process of atmospheric parameter estimation.
Figure 7. The comparison between the predicted value and the real value of the temperature at 1000 hPa in the whole dataset model.
Figure 8. Temperature prediction effect diagram: (a) temperature prediction at 1000 hPa; (b) temperature prediction at 925 hPa.
Figure 9. Height prediction effect diagram: (a) height prediction at 1000 hPa; (b) height prediction at 925 hPa.
Figure 10. Vapor pressure prediction effect diagram: (a) vapor pressure prediction at 1000 hPa; (b) vapor pressure prediction at 925 hPa.
Figure 11. Wind speed prediction effect diagram: (a) wind speed prediction at 1000 hPa; (b) wind speed prediction at 925 hPa.
Figure 12. Wind direction prediction effect diagram: (a) wind direction prediction at 1000 hPa; (b) wind direction prediction at 925 hPa.
Figure 13. The temperature prediction residual at 1000 hPa. The red dashed line represents y = 0, the level at which the model predictions perfectly match the actual observations.
Figure 14. Residual graph effect: (a) the frequency histogram of the residuals at 1000 hPa illustrates the distribution of prediction errors; (b) the Q-Q plot compares the quantiles of the residuals to the quantiles of a normal distribution, where the red line represents the theoretical normal distribution line.
Figure 15. Cross validation analysis. In the figure, the label ‘3, 16’ denotes a forest of 16 decision trees with a depth of 3; other labels follow the same pattern. The orange dashed box indicates where the model scores the highest and performs the best.
Figure 16. Random forest performance analysis diagram: (a) the confusion matrix of the random forest; (b) the Area Under the Curve (AUC) of the random forest.
Figure 17. KS Curve.
Figure 18. The prediction results of the shortened dataset for the temperature at 1000 hPa.


Abstract

Atmospheric data forecasting traditionally relies on physical models, which simulate atmospheric motion and change by solving atmospheric dynamics, thermodynamics, and radiative transfer processes. However, numerical models often involve significant computational demands and time constraints. In this study, we analyze the performance of Gated Recurrent Units (GRU) and Long Short-Term Memory networks (LSTM) using over two decades of sounding data from the Xisha Island Observatory in the South China Sea. We propose a hybrid model that combines GRU and Random Forest (RF) in series, which predicts the presence of atmospheric ducts from limited data. The results demonstrate that GRU achieves prediction accuracy comparable to LSTM with 10% to 20% shorter running times. The prediction accuracy of the GRU-RF model reaches 0.92. This model effectively predicts the presence of atmospheric ducts in certain height regions, even with low data accuracy or missing data, highlighting its potential for improving efficiency in atmospheric forecasting.

Keywords: atmospheric data forecasting; GRU; LSTM; RF; atmospheric duct prediction

1. Introduction

The troposphere, the lowest layer of Earth’s atmosphere, is shaped by a variety of random and systematic weather processes that affect both the atmospheric state (temperature, pressure, humidity, etc.) and atmospheric transport (waves, ozone, etc.) in complex ways [1,2,3,4]. These variations significantly alter the atmospheric refractive index, a key factor in how light and radio waves travel through the air [5,6]. When the vertical gradient of the refractive index is sufficiently negative, electromagnetic waves can be trapped close to the surface, creating what is known as an atmospheric duct [7,8]. This condition not only extends the detection range of radars but can also introduce ‘blind spots’ where detection fails [9]. Understanding and predicting these ducts is crucial, as they strongly affect radar operations and communication systems, especially in marine zones where these conditions persist throughout the year [10].

Accurately predicting atmospheric ducts is crucial, and a variety of methods have been developed to address this challenge. Traditional approaches include physical and numerical models, such as the MM5 mesoscale model and the Weather Research and Forecasting (WRF) model. These models, based on principles of atmospheric physics, thermodynamics, and hydrodynamics, simulate the formation and dynamics of atmospheric ducts. For instance, Liu et al. [11] applied the WRF model to study atmospheric ducts in China’s coastal seas, providing valuable insights into their spatial and temporal distribution. While these physical models are well established, they often require significant computational resources and can struggle to account for the nonlinear, complex nature of atmospheric phenomena, particularly in dynamic environments like marine regions.

In response to these limitations, data-driven inversion techniques have gained popularity. These methods, such as Refractivity from Clutter (RFC), infer atmospheric refractive index profiles from radar clutter data [12,13]. For example, Yang et al. [14] inverted the surface duct model from radar sea clutter power (RSCP) obtained from the parabolic wave equation (PWE), using a Bayesian-optimized random forest algorithm to improve the accuracy of surface duct predictions. Additionally, optimization algorithms, such as genetic algorithms [15] and particle swarm optimization [16], have been applied to improve the accuracy of evaporation duct modeling. Despite their success, these approaches face difficulties when dealing with noisy or incomplete data, a frequent challenge in real-world atmospheric studies.

To overcome these issues, statistical modeling techniques have been widely adopted, particularly in time-series predictions. Classical statistical models, such as the Autoregressive Integrated Moving Average (ARIMA) and Autoregressive Conditional Heteroscedasticity (ARCH) models, have been used extensively in the atmospheric sciences for predicting various phenomena [17,18,19,20]. However, these models often struggle with capturing the nonlinear and chaotic nature of atmospheric data. This has led to the increasing adoption of artificial intelligence (AI) models, such as Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), and Gated Recurrent Units (GRUs), which are better equipped to handle complex dependencies and nonlinear data patterns [21]. Nonetheless, incomplete or missing data due to environmental interference remains a key challenge for these AI models.

A significant challenge in the field is the optimal handling of such incomplete datasets, especially in predicting atmospheric duct phenomena where high vertical resolution is paramount. Traditional approaches might fail to effectively interpolate missing data or capture subtle atmospheric nuances in baroclinic layers, where data are often scarce or noisy. This paper introduces a novel methodology that synergizes the predictive power of GRU models with the robustness of Random Forest (RF) algorithms, specifically designed to address the challenges posed by data deficiencies in atmospheric duct detection.

Using sounding data from Xisha Island in the South China Sea, this study trains the GRU model to forecast atmospheric parameters from historical patterns and feeds these predictions into an RF model to assess the presence of ducts. This dual-model approach leverages the GRU’s capacity for handling time-series data with temporal dependencies and the RF’s strength in integrating multiple decision trees to improve prediction accuracy. Under identical setups, the GRU trained faster than the LSTM and produced lower prediction errors, with the three-layer configuration performing best. The combined GRU-RF model achieved 92% accuracy in predicting the presence of ducts at heights between 50 m and 300 m. This result demonstrates the model’s capability in data-sparse environments and offers a promising approach to a long-standing problem in meteorological data analysis.

2. Materials and Methods

2.1. Neural Networks for Time Series

The Recurrent Neural Network (RNN) is a specialized type of neural network crafted for processing sequential data. Its architecture uniquely facilitates connections across successive nodes via time steps, thereby capturing temporal correlations within data sequences [22,23]. Nevertheless, conventional RNNs frequently encounter issues of vanishing and exploding gradients when handling long-term dependencies. This inherent flaw complicates the extraction of relationships across extended intervals within sequences. Figure 1 depicts the fundamental unit structure of an RNN, illustrating its straightforward nodal connectivity which, in turn, underscores its challenges in managing complex sequence data [24,25].

To address these challenges, the Long Short-Term Memory Network (LSTM) was developed. The LSTM enhances the capabilities of traditional RNNs by integrating a gating mechanism, which consists of three principal units: the forget gate, the input gate, and the output gate. The forget gate decides which pieces of information to remove from the cell state, whereas the input gate determines what new data should be incorporated into the cell state. The output gate regulates the influence of the current cell state on the output of the hidden layer. This feature enables the LSTM to sustain long-term memory in the presence of long-term dependencies, thereby effectively capturing pertinent information within sequence data.

The cell state in Long Short-Term Memory (LSTM) networks has the capacity to store and convey information over extended periods, while the state of the hidden layer primarily dictates the output at each time step, influencing predictions and decisions in subsequent tasks. Through its sophisticated gating mechanism, the LSTM adeptly addresses the challenges of gradient vanishing and explosion that traditional RNNs often face, thus demonstrating robust performance across a myriad of sequence prediction tasks.

In 2014, Cho et al. [26] introduced the Gated Recurrent Unit (GRU), a streamlined variant of the LSTM. The GRU merges the LSTM’s forget and input gates into a single update gate and dispenses with the separate cell state, reducing the model’s parameter count and the likelihood of overfitting while maintaining performance comparable to that of the LSTM. Its architecture uses only a reset gate and an update gate to model long-term dependencies, which lowers computational demands. The reset gate controls how much of the previous hidden state enters the candidate hidden state, while the update gate balances new information against the retention of previous states. The candidate hidden state is the proposed new content for the hidden state, computed from the current input and the reset-gated previous state. This configuration allows GRUs to manage long-term dependencies without separate memory cells, leading to efficient training and inference. Their reduced complexity also makes GRUs well suited to real-time applications and to deployment on devices with limited resources. Figure 2 illustrates the unit structure of the GRU, highlighting its simplified design.

The GRU is composed of four main components: the reset gate, the update gate, the candidate hidden state, and the current hidden state. Each component plays a pivotal role in modulating the flow of information during the training process, allowing the network to retain essential information over long sequences without the exponential decay of gradients. This structured approach enables GRUs to excel in tasks involving sequential data, such as time series prediction, text generation, and speech recognition, making it a versatile and powerful tool in the arsenal of deep learning techniques. The four components of a GRU are specified by the following:

Update Gate: The update gate controls the balance between retaining the hidden state of the previous time step and accepting new information at the current time step. Its calculation formula is:

$$z_{t} = \sigma (W_{z} h_{t-1} + U_{z} x_{t} + b_{z}), \tag{1}$$

where $z_{t}$ is the value of the update gate, $h_{t-1}$ is the hidden state of the previous time step, $x_{t}$ is the input of the current time step, $W_{z}$ and $U_{z}$ are the weight matrices of the update gate, $b_{z}$ is the corresponding bias vector, and $\sigma$ is the sigmoid function.

Reset Gate: The reset gate determines how to combine the hidden state of the previous time step with the input of the current time step to calculate the candidate hidden state. Its calculation formula is:

$$r_{t} = \sigma (W_{r} h_{t-1} + U_{r} x_{t} + b_{r}), \tag{2}$$

where $r_{t}$ is the value of the reset gate, $W_{r}$ and $U_{r}$ are the weight matrices of the reset gate, and $b_{r}$ is the corresponding bias vector.

Candidate Hidden State: The candidate hidden state is the proposed new value of the hidden state at the current time step. It is calculated from the current input and the previous hidden state after the latter has been scaled element-wise by the reset gate. Its calculation formula is:

$$\tilde{h}_{t} = \tanh (W_{h} (r_{t} \odot h_{t-1}) + U_{h} x_{t} + b_{h}), \tag{3}$$

where $\tilde{h}_{t}$ is the candidate hidden state, $\odot$ denotes element-wise multiplication, $W_{h}$ and $U_{h}$ are the corresponding weight matrices, and $b_{h}$ is the corresponding bias vector.

Updated Hidden State: The final hidden state is obtained by combining the update gate and the candidate hidden state. Its calculation formula is:

$$h_{t} = (1 - z_{t}) \odot h_{t-1} + z_{t} \odot \tilde{h}_{t}, \tag{4}$$

where $h_{t}$ is the final hidden state, $(1 - z_{t}) \odot h_{t-1}$ controls the retention of past information, and $z_{t} \odot \tilde{h}_{t}$ manages the update of new information to the hidden state.
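The four GRU components can be written out as a single time step in NumPy. This is a minimal sketch for illustration rather than the implementation used in the study: it assumes the standard formulation in which the reset gate scales the previous hidden state inside the candidate, and the helper names (`gru_step`, `init_params`, `run_gru`), the random initialization, and the zero initial state are all hypothetical choices.

```python
import numpy as np


def sigmoid(a):
    """Logistic function used by both gates."""
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x_t, h_prev, p):
    """One GRU time step: update gate, reset gate, candidate, new hidden state."""
    z_t = sigmoid(p["W_z"] @ h_prev + p["U_z"] @ x_t + p["b_z"])  # update gate
    r_t = sigmoid(p["W_r"] @ h_prev + p["U_r"] @ x_t + p["b_r"])  # reset gate
    # Candidate state: the reset gate scales the previous state element-wise.
    h_tilde = np.tanh(p["W_h"] @ (r_t * h_prev) + p["U_h"] @ x_t + p["b_h"])
    # Convex combination of the old state and the candidate.
    return (1.0 - z_t) * h_prev + z_t * h_tilde


def init_params(n_in, n_hidden, seed=0):
    """Hypothetical small random initialization (not taken from the paper)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "zrh":
        p[f"W_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        p[f"U_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_in))
        p[f"b_{gate}"] = np.zeros(n_hidden)
    return p


def run_gru(sequence, params, n_hidden):
    """Feed a (time, features) sequence through the cell, starting from zeros."""
    h = np.zeros(n_hidden)
    for x_t in sequence:
        h = gru_step(x_t, h, params)
    return h
```

Because the new state is a gate-weighted average of the previous state and a `tanh` output, a state that starts at zero stays strictly inside (-1, 1), which is one reason the hidden state does not blow up over long sequences.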

2.2. Random Forest

A decision tree is a tree-structured machine learning model that uses a series of decision rules and potential outcomes to classify or regress data [27]. Each internal node in a decision tree represents a test on an attribute, each branch denotes a possible outcome of that test, and each leaf node indicates a prediction category or a specific decision result. Decision trees are noted for their flexibility and robustness in handling both regression and classification problems.

A random forest is an ensemble learning method that utilizes multiple decision trees to enhance prediction accuracy and stability [28,29]. This method creates a ‘strong’ model by combining several ‘weak’ decision trees. Figure 3 illustrates a sample of the random forest model constructed in this study, displaying some decision trees and their classification criteria. Random forests are developed through the following steps to achieve efficient data classification or regression:

Random Sampling: Bootstrap sampling is employed to randomly draw samples from the original training set, forming the training data for each tree. This method ensures that the training data for each tree are independent and potentially diverse;
Feature Random Selection: At each split node of the decision tree, a subset of features is randomly selected for the split. This strategy minimizes the correlation among the trees within the model, introduces more randomness, and enhances the model’s generalization capability;
Building Trees: These steps are repeated to train each decision tree based on the selected samples and features;
Aggregation of Predictions: For classification tasks, a voting mechanism determines the final category; for regression tasks, the prediction results of all trees are averaged to compute the final prediction value.

Figure 3. Random Forest Schematic. In this figure, the unit of temperature T is Celsius, the unit of pressure press is hPa, the unit of relative humidity rh is %, and the unit of wind speed V is m/s.

Through these steps, the random forest method not only markedly improves the precision of predictions but also controls overfitting, making it an effective tool for a wide range of machine learning applications.
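The four steps can be sketched in plain Python. To keep the example short, each "tree" here is a depth-1 decision stump rather than a full decision tree, so this is a simplified illustration of bootstrap sampling, random feature selection, and majority voting, not the forest built in this study. The function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(X, y, rng):
    """Step 1: draw len(X) rows with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_stump(X, y, feature_ids):
    """Depth-1 tree: best (feature, threshold) among the candidate features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(y).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def fit_forest(X, y, n_trees=25, max_features=1, seed=0):
    """Steps 1-3: one bootstrap sample and one random feature subset per stump."""
    rng = random.Random(seed)
    n_features = len(X[0])
    forest = []
    for _ in range(n_trees):
        Xb, yb = bootstrap_sample(X, y, rng)
        feature_ids = rng.sample(range(n_features), max_features)  # step 2
        forest.append(fit_stump(Xb, yb, feature_ids))
    return forest


def predict(forest, row):
    """Step 4: majority vote over all stumps (classification)."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy data: two features, label 1 when both are large.
X = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 8], [9, 8], [8, 9], [9, 9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
```

For regression, the vote in `predict` would be replaced by the mean of the individual tree outputs.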

2.3. Atmospheric Duct Discrimination

The tropospheric atmosphere is an inhomogeneous medium whose temperature, pressure, and humidity change with time and space. The relationship between atmospheric refractivity and atmospheric temperature, pressure, and water vapor pressure can be expressed as follows:

$$N = \frac{77.6 P}{T} + \frac{3.73 \times 10^{5} e}{T^{2}}, \tag{5}$$

where $N$ is the atmospheric refractivity (N-units), $e$ is the water vapor pressure (hPa), $P$ is the atmospheric pressure (hPa), and $T$ is the thermodynamic temperature of the air (K). The water vapor pressure $e$ can be obtained from the dew point temperature, which is the air temperature minus the dew point depression:

$$e = 6.1078 \times \exp \left[ \frac{a (T_{d} - 273.16)}{T_{d} - b} \right], \tag{6}$$

where $T_{d}$ is the dew point temperature (K), and $a$ and $b$ are constants, usually taken as 17.2693882 and 35.86, respectively, when the temperature is at least 258.15 K. When the curvature of the Earth is taken into account, the modified refractivity can be expressed as follows:

$$M = N + \frac{z}{r_{e}} \times 10^{6} = N + 157 z, \tag{7}$$

where $M$ is the modified refractivity (M-units), $r_{e}$ is the mean radius of the Earth (taken as 6371 km), and $z$ is the height above sea level (km). The derivative of $M$ with respect to height $z$ is:

$$\frac{d M}{d z} = \frac{d N}{d z} + 157. \tag{8}$$

When the refractivity gradient satisfies $d N / d z < -157$ N-units/km, or equivalently the modified refractivity gradient satisfies $d M / d z < 0$, electromagnetic waves undergo trapping refraction, and an atmospheric duct is considered to exist in this region [30,31].
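The chain from sounding variables to a duct flag can be sketched as follows. This is an illustrative example under stated assumptions: the function names are hypothetical, the dew point temperature is taken as air temperature minus dew point depression, heights are in km so that the factor 157 applies, and the gradient between adjacent levels is approximated by a finite difference.

```python
import math


def vapor_pressure_hpa(dew_point_k, a=17.2693882, b=35.86):
    """Water vapor pressure (hPa) from the dew point temperature (K)."""
    return 6.1078 * math.exp(a * (dew_point_k - 273.16) / (dew_point_k - b))


def refractivity(pressure_hpa, temperature_k, vapor_pressure):
    """Refractivity N (N-units) from pressure, temperature, and vapor pressure."""
    return (77.6 * pressure_hpa / temperature_k
            + 3.73e5 * vapor_pressure / temperature_k ** 2)


def modified_refractivity(n_units, height_km):
    """Modified refractivity M (M-units), with height in km."""
    return n_units + 157.0 * height_km


def duct_layers(heights_km, m_units):
    """Flag each layer between adjacent levels where dM/dz < 0."""
    flags = []
    for k in range(len(heights_km) - 1):
        dz = heights_km[k + 1] - heights_km[k]
        gradient = (m_units[k + 1] - m_units[k]) / dz
        flags.append(gradient < 0)
    return flags


# Hypothetical three-level profile (values chosen only for illustration).
heights = [0.0, 0.1, 0.2]   # km
n_profile = [330.0, 310.0, 305.0]  # N-units
m_profile = [modified_refractivity(n, z) for n, z in zip(n_profile, heights)]
flags = duct_layers(heights, m_profile)
```

In the hypothetical profile, refractivity drops by 20 N-units over the lowest 0.1 km (a gradient of -200 N-units/km), so only that layer satisfies the trapping condition.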

Figure 4 illustrates the propagation of electromagnetic waves under varying atmospheric conditions, categorized by the gradient of the refractive index. Each category corresponds to a specific type of propagation phenomenon. When $d N / d z > 0$, the condition is termed ‘sub-refraction’, characterized by a bending of the waves away from the surface. For $0 > d N / d z > -79$, the propagation is in the ‘standard refraction’ range, where electromagnetic waves bend normally towards the Earth’s surface. The range $-79 > d N / d z > -157$ is defined as ‘super-refraction’, in which waves bend more significantly towards the ground, enhancing the range and strength of signal transmission. Lastly, when $d N / d z < -157$, an atmospheric ducting effect occurs, known as ‘atmospheric duct propagation’. This condition allows for the trapping and guiding of waves in a layer of the atmosphere, facilitating long-distance communication beyond the horizon. This classification highlights the crucial impact of atmospheric variations on the behavior of electromagnetic wave propagation.
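The four propagation categories map directly onto a threshold lookup. The sketch below assumes gradients in N-units/km; the text does not say which category the exact boundary values (0, -79, -157) belong to, so assigning them to the more strongly refracting category is an arbitrary choice made here, and the function name is hypothetical.

```python
def classify_refraction(dn_dz):
    """Return the propagation category for a refractivity gradient (N-units/km)."""
    if dn_dz > 0:
        return "sub-refraction"
    if dn_dz > -79:
        return "standard refraction"
    if dn_dz > -157:
        return "super-refraction"
    return "atmospheric duct"


# Example gradients, one per category (illustrative values only).
categories = [classify_refraction(g) for g in (20.0, -40.0, -100.0, -200.0)]
```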

3. Results and Analysis

3.1. Data

The Integrated Global Radiosonde Archive (IGRA) is a comprehensive global system for archiving radiosonde data, managed by the National Centers for Environmental Information (NCEI) [32]. Radiosondes are typically launched twice daily at 00:00 and 12:00 UTC (00 Z and 12 Z), providing standardized atmospheric observations worldwide. The dataset includes measurements from standard pressure levels (e.g., 1000 hPa, 925 hPa, 850 hPa, 700 hPa), surface observations, and data at the tropopause and other floating levels. Variables include air pressure, temperature, geopotential height, relative humidity, dew point depression, wind direction, wind speed, and elapsed time since launch.

IGRA applies rigorous quality control measures to ensure data accuracy. However, data gaps may occur due to factors such as equipment malfunctions or operational issues during the radiosonde’s ascent. In some cases, measurements are available only at standard pressure levels, resulting in sparser datasets, particularly under challenging environmental conditions in the upper atmosphere.

This study conducts a comprehensive analysis of radiosonde data sourced from Xisha Island in the South China Sea. The dataset spans an extensive period from 1 January 2000, to 28 June 2024, amounting to a total of 8580 days of observations (https://www.ncei.noaa.gov/ accessed on 29 June 2024). As illustrated in Figure 5, the dataset is categorized into two distinct types based on data availability and accuracy: low-precision and high-precision data.

The low-precision data, indicated by red dots in Figure 5, correspond to measurements available only at specific atmospheric pressure levels. While these data points have lower vertical resolution, they are consistently available throughout the observation period, ensuring temporal continuity in the dataset. Given their reliability across time, these data were selected for use in the time-series models covering the full range from 2000 to 2024. Although limited in vertical detail, these measurements remain crucial for capturing long-term atmospheric trends and broad changes in atmospheric conditions.

The high-precision data, represented by black areas in Figure 5, include measurements from additional floating pressure levels, offering a more detailed vertical profile of the atmosphere. These finer-resolution data, covering pressure ranges from near sea level to the mid-troposphere (around 700 hPa), are valuable for analyzing the temporal and spatial variability of atmospheric parameters in greater depth. The inclusion of temperature, pressure, and humidity at various heights makes it possible to compute refractivity values and their vertical gradients. Furthermore, these data allow the calculation of modified atmospheric refractive index gradients, following Equations (5)–(8), to assess the presence of atmospheric ducts (critical regions where electromagnetic waves can propagate over extended distances) and other fine-scale meteorological phenomena.

Over the 24-year dataset, whenever 16 consecutive days have complete data at the standard pressure levels, these 16 days are grouped into a single sample (the first 15 days serving as input and the 16th day as output) and included in Dataset 1. This dataset is used for training the time-series neural networks; a total of 6061 samples were obtained in this manner, which is sufficient for effective neural network training. In contrast, floating pressure levels were recorded on 1700 days, but there are fewer than 100 instances in which such data span 16 consecutive days, making this dataset too sparse to support time-series prediction. Furthermore, the floating pressure levels reported by radiosondes vary somewhat randomly from launch to launch, making them unsuitable as a consistent output for neural network training. Therefore, the data from floating pressure levels are compiled into Dataset 2, which is used to train a random forest model.
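The construction of Dataset 1 can be illustrated with a short sketch. It assumes one record per calendar day (the archive itself holds up to two soundings per day), a window that slides forward one day at a time, and a hypothetical `records` dictionary that contains only the days with complete standard-level data; none of these details are specified in the text.

```python
from datetime import date, timedelta


def build_windows(records, n_input=15):
    """Group every run of n_input + 1 consecutive complete days into one sample.

    records: dict mapping a date to that day's feature list; days with
    missing data are simply absent from the dict.
    Returns a list of (inputs, target) pairs.
    """
    windows = []
    for start in sorted(records):
        days = [start + timedelta(days=k) for k in range(n_input + 1)]
        if all(day in records for day in days):
            inputs = [records[day] for day in days[:n_input]]  # first 15 days
            target = records[days[n_input]]                    # 16th day
            windows.append((inputs, target))
    return windows


# Hypothetical example: 40 days of data with day index 10 missing.
first_day = date(2000, 1, 1)
records = {first_day + timedelta(days=k): [float(k)]
           for k in range(40) if k != 10}
windows = build_windows(records)
```

A single missing day removes every window that overlaps it, which is why sparse floating-level data yield so few usable sequences.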

3.2. Prediction of Atmospheric Parameters

Figure 6 illustrates the process of using a GRU model to predict atmospheric parameters. Because the variables in the dataset have different scales, normalization is first applied to improve numerical stability and training efficiency. The input data comprise 20 variables from the past 15 days: temperature, dew point depression, height, wind speed, and wind direction at pressure levels of 1000 hPa, 925 hPa, 850 hPa, and 700 hPa. The output of each model is a forecast of one of these variables for the subsequent 24 h. A sliding window is used to extract input-output pairs from the original dataset, which are then divided in order: the first 78% form the training set, the next 11% (from 78% to 89%) the validation set, and the final 11% the test set. The training set is used to fit the model’s learnable parameters, such as weights and biases. The validation set guides the selection of hyperparameters such as the learning rate and the number of training epochs, while the test set is used for the final evaluation of the model.
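A possible form of the split and normalization step is sketched below. The text does not state which normalization is used, so min-max scaling is an assumption, as is fitting the scaling statistics on the training portion only (a common way to avoid leaking validation and test information). The function names are hypothetical.

```python
import numpy as np


def chronological_split(samples, train_pct=78, val_pct=11):
    """Split in order: first 78% train, next 11% validation, remainder test."""
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


def fit_min_max(train):
    """Per-feature minimum and range, computed from the training set only."""
    flat = train.reshape(-1, train.shape[-1])
    lo = flat.min(axis=0)
    span = flat.max(axis=0) - lo
    span[span == 0] = 1.0  # guard against constant features
    return lo, span


def apply_min_max(x, lo, span):
    """Scale features with statistics obtained from fit_min_max."""
    return (x - lo) / span


# Hypothetical array shaped (samples, 15 days, 20 variables).
X = np.arange(100 * 15 * 20, dtype=float).reshape(100, 15, 20)
train, val, test = chronological_split(X)
lo, span = fit_min_max(train)
train_n = apply_min_max(train, lo, span)
val_n = apply_min_max(val, lo, span)
```

With training-set statistics, validation and test values can fall outside [0, 1]; this is expected and indicates conditions not seen during training.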

To comprehensively evaluate the performance of the model, we use the mean squared error (MSE) and the mean absolute angle error (MAAE) as evaluation indicators for different variables. MSE measures the average of the squared differences between the values predicted by the model and the true values. The formula is as follows:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}, \tag{9}$$

where $n$ is the number of samples, $\hat{y}_{i}$ is the predicted value of the model, and $y_{i}$ is the true value. To evaluate the accuracy of wind direction predictions, we also use MAAE, which is calculated as follows:

$$\mathrm{MAAE} = \frac{1}{n} \sum_{i=1}^{n} \left( 180 - \left| \left| \theta_{i} - \hat{\theta}_{i} \right| - 180 \right| \right), \tag{10}$$

where $\theta_{i}$ and $\hat{\theta}_{i}$ represent the true angle and the predicted angle in degrees, respectively. By minimizing MSE and MAAE, the model can not only reduce the prediction error but also improve its ability to capture trends in the angle.
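Both metrics are short enough to state in code. The sketch below assumes angles in degrees; the modulo on the absolute difference is an added safeguard for inputs outside [0, 360) and leaves the result unchanged for angles already in that range. The function names are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def maae(theta_true, theta_pred):
    """Mean absolute angle error in degrees, wrapping across 0/360."""
    total = 0.0
    for t, p in zip(theta_true, theta_pred):
        diff = abs(t - p) % 360.0
        total += 180.0 - abs(diff - 180.0)  # shortest angular distance
    return total / len(theta_true)
```

For example, a true direction of 350° and a prediction of 10° differ by 340° numerically, but the wrapped error is 20°, which is what makes this metric more suitable than MSE for wind direction.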

For each variable under study, an independent GRU-based prediction model is constructed, following a multi-input single-output architecture. Each model comprises an input layer, a stack of three GRU layers with a hidden size of 64, and a fully connected output layer. The input to each model has shape (5754, 15, 20), representing the number of samples, the sequence length, and the number of features, respectively. The output has shape (5754, 1), one predicted value per sample. Training runs for 20 epochs with a batch size of 64 and a learning rate of 0.001. Each model is evaluated on the test set, with a focus on the magnitude of the prediction error and on how well the model captures trend changes in the variable.
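To give a sense of model size, the recurrent parameter count of this configuration (20 input features, hidden size 64, three stacked layers) can be worked out by hand. The sketch assumes one bias vector per gate, as in the GRU equations of Section 2.1; deep learning frameworks may use a second bias vector, which would change the totals slightly. An LSTM of the same shape is included for comparison, since it has four weight blocks per layer instead of three. These counts are an illustration and are not reported in the study.

```python
def gru_layer_params(n_in, n_hidden):
    """Three blocks (update, reset, candidate): W (h x h), U (h x in), b (h)."""
    return 3 * (n_hidden * n_hidden + n_hidden * n_in + n_hidden)


def lstm_layer_params(n_in, n_hidden):
    """Four blocks (forget, input, output, cell candidate) of the same shape."""
    return 4 * (n_hidden * n_hidden + n_hidden * n_in + n_hidden)


def stacked_params(layer_fn, n_in, n_hidden, n_layers):
    """First layer sees the inputs; deeper layers see the previous hidden state."""
    total = layer_fn(n_in, n_hidden)
    total += (n_layers - 1) * layer_fn(n_hidden, n_hidden)
    return total


gru_total = stacked_params(gru_layer_params, 20, 64, 3)
lstm_total = stacked_params(lstm_layer_params, 20, 64, 3)
```

Under these assumptions the GRU stack has three quarters of the recurrent parameters of the equivalent LSTM stack, which is consistent in direction with the shorter training times reported for the GRU.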

In addition, we experimented with two different modeling approaches: constructing 20 single-output models and a single multi-output model that predicts 20 variables simultaneously. The difference in training loss between these two approaches was less than 5%, indicating similar predictive performance. However, when plotting the training loss against epochs, the loss of the multi-output model plateaued at around 0.4, due to the aggregation of the losses from multiple variables. This made it difficult to assess the performance of individual variables and to adjust hyperparameters based on the loss curves. Therefore, we opted to construct 20 independent single-output models, which allowed for a clearer evaluation of the prediction performance for each variable and facilitated hyperparameter tuning.

Figure 7 illustrates the predicted and actual temperature values at 1000 hPa for the training, validation, and test sets. In the figure, the blue line represents the actual values of the training set, while the orange line shows the predicted values. Similarly, the green line corresponds to the actual values of the validation set, with the red line indicating the predicted values. The purple line represents the actual values of the test set, and the brown line depicts the predicted values. Notably, there is a significant fluctuation in the actual values around March 2024, which deviates from the typical trend observed in previous years.

Figures 8–12 show the prediction results for selected parameters, where the blue line represents the actual value of the variable and the orange line represents the value predicted by the model. Temperature, height, and water vapor pressure are closely related to one another and all show clear seasonal variation. Their actual values fluctuate strongly around March 2024, and the neural network fails to capture this change well. This prediction error may be due to the strong nonlinearity of these variables and the scarcity of extreme-event samples in the training data. In other periods, however, the predicted temperature and height agree closely with the actual values. Wind speed and wind direction, by contrast, can change abruptly under the influence of other factors, which makes their patterns more difficult to capture.

Table 1 presents the R² scores of the variables at 1000 hPa for the training, validation, and test datasets. An R² score greater than 0 indicates that the model outperforms a constant prediction of the mean, and the closer the score is to 1, the better the model’s performance. As shown in the table, the temperature score is the highest; the wind speed and wind direction scores are lower but still greater than 0.5. Because the R² scores on the training set are close to those on the validation and test sets, the model shows no serious overfitting: it performs well on the training set and maintains high performance on the other datasets.
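For reference, the R² score can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses this standard definition; the function name is hypothetical, and the text does not state which implementation was used in the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives 1, always predicting the mean of the true values gives 0, and a model that does worse than the mean gives a negative score.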

The residual plot offers a clear visualization of the model’s error distribution, making it easier to identify any potential systematic deviations. Figure 13 displays the residuals between the predicted and actual temperatures at an atmospheric pressure of 1000 hPa. The mean of the residuals is calculated to be 0.192255, with a standard deviation of 0.73725. These values suggest that there is no significant systematic deviation in the model, and the predicted fluctuations remain relatively stable.

Additionally, we generated a histogram of the residuals and a Q-Q plot (Quantile-Quantile Plot) based on the residual plot, as shown in Figure 14. The residuals’ distribution closely approximates a normal distribution, further supporting the model’s validity. The normality of the residuals indicates that the prediction errors are random and do not exhibit any systematic trends, suggesting that the model effectively captures the underlying relationships within the data. This enhances the reliability of the model’s parameter estimates and provides a solid foundation for subsequent statistical inference and the construction of prediction intervals.

As shown in Table 2, this study calculates the prediction error of each variable with three layers of GRU. These variables include temperature, height, water vapor pressure, wind speed and wind direction, which are predicted under different atmospheric pressure conditions. We use MSE to measure the prediction accuracy of temperature, water vapor pressure, height and wind speed, while the wind direction error is evaluated by MAAE.
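The two error measures can be illustrated as follows. MSE is standard; for the wind direction error, this sketch assumes that the angular error is the smallest absolute angle between the two directions, so that 350° and 10° differ by 20° rather than 340°. That reading, and the function names, are assumptions rather than something this section states.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_angular_error(deg_true, deg_pred):
    """Mean of the smallest absolute angle between two directions, in degrees."""
    total = 0.0
    for t, p in zip(deg_true, deg_pred):
        diff = abs(t - p) % 360.0
        # Directions wrap around at 360°, so take the shorter arc.
        total += min(diff, 360.0 - diff)
    return total / len(deg_true)
```

A plain absolute difference would heavily penalize predictions that sit just across the 0°/360° boundary, which is why a wrapped difference is the usual choice for directional data.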

To further evaluate the performance of the model, this study constructed and compared GRU and LSTM models with different numbers of layers, with particular attention to the variable prediction at the 1000 hPa level. Averaging the errors over 20 independent runs (see Table 3) shows that the three-layer GRU model performs best in the prediction of temperature, wind speed, and water vapor pressure, with the smallest error. The single-layer GRU is more accurate in height and wind direction prediction. Because the formation of the duct layer depends on the characteristics of the inversion layer and the inversion wet layer, the three-layer GRU, which predicts temperature and water vapor pressure most accurately, is the best model choice for this study.

In addition, although the training time of the model increases with the number of layers, the GRU model usually has a shorter training cycle than the LSTM model under the same training conditions, with slightly improved accuracy. Notably, the “Time/s” metric in Table 3 represents the time (in seconds) required to train the model, offering a quantitative comparison of the training efficiency between models. This metric is crucial in evaluating the trade-off between training speed and model performance, especially when dealing with large datasets or complex architectures.
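One plausible contributor to the shorter GRU training time is the smaller parameter count. The sketch below uses the textbook count, in which a GRU layer has three weight blocks and an LSTM layer has four; real frameworks may add extra bias vectors, and the input and hidden sizes here are hypothetical because this section does not report them. The training times are copied from Table 3.

```python
def layer_params(input_size, hidden_size, n_blocks):
    """Textbook parameter count for one recurrent layer.

    Each block has an input matrix, a recurrent matrix and one bias vector.
    """
    per_block = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return n_blocks * per_block


def stack_params(input_size, hidden_size, n_layers, n_blocks):
    """Total parameters of n_layers stacked recurrent layers."""
    total, size_in = 0, input_size
    for _ in range(n_layers):
        total += layer_params(size_in, hidden_size, n_blocks)
        size_in = hidden_size  # deeper layers read the previous layer's output
    return total


GRU_BLOCKS = 3   # update gate, reset gate, candidate state
LSTM_BLOCKS = 4  # input, forget and output gates, candidate cell state

# Hypothetical sizes: 5 input variables, 64 hidden units (not from the paper).
gru_total = stack_params(5, 64, 3, GRU_BLOCKS)
lstm_total = stack_params(5, 64, 3, LSTM_BLOCKS)
param_ratio = gru_total / lstm_total  # 0.75 for any sizes under this counting

# Training times in seconds from Table 3, keyed by number of layers: (GRU, LSTM).
TRAIN_TIME_S = {1: (6.58, 7.51), 2: (11.80, 15.02), 3: (16.65, 21.64)}
time_ratio = {n: gru_t / lstm_t for n, (gru_t, lstm_t) in TRAIN_TIME_S.items()}
```

The measured time ratios in Table 3 range from about 0.88 for one layer to about 0.77 for three layers, which is of the same order as the 0.75 parameter ratio, although training time also depends on factors this count ignores.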

3.3. Atmospheric Duct Prediction Based on Random Forest

In this study, radiosonde data with high vertical resolution, covering more than 1600 days, were used to label whether an atmospheric duct was present in each height layer. The first 70% of the dataset is used as the training set, and the remaining 30% is used for testing. First, the random forest (RF) model is trained using sounding data at specific height levels; then, the predictions of the GRU model are fed into the RF model to verify the reliability of the combined GRU-RF model.
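The 70/30 split described above keeps the time order, so the test period follows the training period. A minimal sketch, with a hypothetical function name and placeholder records:

```python
def chronological_split(records, train_fraction=0.7):
    """Split time-ordered records into an earlier training part and a later test part."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Placeholder day indices standing in for labeled sounding days.
days = list(range(10))
train_days, test_days = chronological_split(days)
```

Splitting without shuffling avoids training on days that come after the test days.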

Three models were developed to predict the presence of ducts at different height ranges: 50 m to 300 m (Model 1), 300 m to 800 m (Model 2), and 800 m to 1500 m (Model 3). Taking Model 1 as an example, the input data consist of the values of each variable from the standard pressure layer. If a duct is detected within the 50–300 m range, the output label is set to 1, and if no duct is present, it is set to 0. Typically, surface ducts are found between 50 and 300 m, while elevated ducts are more likely to occur above 300 m.
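The labeling rule can be sketched as below. The sketch assumes the standard modified refractivity M = N + 0.157h with N = (77.6/T)(p + 4810e/T), and treats a layer as ducting when M decreases with height between adjacent levels; whether the authors label ducts in exactly this way is an assumption, and the profile values are hypothetical.

```python
def modified_refractivity(pressure_hpa, temp_k, vapor_hpa, height_m):
    """Modified refractivity M (M-units) from the standard radio refractivity formula."""
    n_units = 77.6 / temp_k * (pressure_hpa + 4810.0 * vapor_hpa / temp_k)
    return n_units + 0.157 * height_m


def duct_label(profile, low_m, high_m):
    """Return 1 if M decreases with height in any layer overlapping [low_m, high_m].

    profile is a list of (height_m, M) pairs sorted by increasing height.
    """
    for (h0, m0), (h1, m1) in zip(profile, profile[1:]):
        overlaps = h1 > low_m and h0 < high_m
        if overlaps and h1 > h0 and (m1 - m0) / (h1 - h0) < 0:
            return 1
    return 0


# Hypothetical high-resolution profile of (height in m, M in M-units).
profile = [(0, 380.0), (100, 390.0), (200, 385.0), (400, 410.0), (900, 470.0)]
label_model_1 = duct_label(profile, 50, 300)   # layer 100-200 m has dM/dh < 0
label_model_2 = duct_label(profile, 300, 800)  # M increases throughout this range
```

Each of the three models would then receive its own 0/1 label per day, computed over its own height range.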

To optimize the model parameters, this study employed a comprehensive grid search method [33]. This approach systematically explored a range of key parameters, including the splitting criterion (criterion), the maximum depth of the decision tree, the number of trees in the ensemble, the proportion of features selected at each split, and the minimum number of samples required to split a node. This thorough parameter tuning process ensured the model’s optimal performance and robustness.

The performance evaluation results are shown in Table 4, where support is the number of samples, and precision, recall, F1-score and accuracy are used to evaluate the performance of the model. The calculation formula is as follows:

\mathrm{Precision} = \frac{TP}{TP + FP},

(11)

\mathrm{Recall} = \frac{TP}{TP + FN},

(12)

\mathrm{F1\text{-}score} = 2 \cdot \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},

(13)

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.

(14)

The performance of a classifier is assessed from four basic counts. A true positive (TP) is a positive instance that the model predicts correctly, such as correctly identifying the presence of an atmospheric duct. A true negative (TN) is a negative instance that the model predicts correctly, such as correctly determining that no duct is present. A false positive (FP) is a negative instance that the model incorrectly labels as positive, such as predicting a duct when none exists. A false negative (FN) is a positive instance that the model incorrectly labels as negative, such as predicting no duct when one is actually present. These counts are the basis for calculating precision, recall, F1-score and accuracy. Precision measures the fraction of positive predictions that are correct, recall measures the fraction of actual positive instances that the model identifies, the F1-score is the harmonic mean of precision and recall, and accuracy is the proportion of all predictions that are correct. Together, these indicators help to evaluate the effectiveness of the classification system under various conditions.
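Equations (11)–(14) translate directly into code. The following is a small sketch with made-up counts chosen for easy arithmetic; they are not taken from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1-score and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts: 50 actual duct days and 150 non-duct days.
metrics = classification_metrics(tp=40, tn=140, fp=10, fn=10)
```

With these counts, precision, recall and F1-score are all 0.8 while accuracy is 0.9, which shows how a large negative class can lift accuracy above the per-class scores.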

Model 1 demonstrates strong performance in detecting duct phenomena within the 50–300 m range, achieving a precision of 95% for non-duct events and 83% for duct events, and an overall accuracy of 92% (Table 4). In contrast, Models 2 and 3 are less accurate in predicting duct events within the 300–1500 m height range.

K-fold cross-validation is a model validation method commonly used in machine learning to evaluate the performance of models on unseen data [34,35]. This method is particularly useful because it can reduce the risk of overfitting the model on a specific dataset and provide a stable estimate of the model performance. In this paper, 4-fold cross-validation is used. By training and testing models on 4 different data subsets, the cross-validation results under different parameter settings are compared, and the best parameter combination is selected.

Figure 15 shows the average score using four-fold cross-validation, analyzing the influence of tree depth (3, 4, 5) and the number of trees (16, 18, 20) on the training results. Through four-fold cross-validation, we can effectively evaluate the generalization ability of the model under different parameter settings. In this analysis, when the number of trees is set to 18, the model achieves the optimal cross-validation score, indicating that this configuration strikes a balance between model performance and computational efficiency, making it a cost-effective and reasonable choice under the current dataset and model configuration.
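The search over tree depth and number of trees with four-fold cross-validation can be sketched as follows. The fold construction and the grid loop are generic; the scoring function is a hypothetical placeholder that stands in for training and scoring a random forest, and it is constructed to peak at the configuration the text reports, so it does not reproduce the scores in Figure 15.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def grid_search_cv(n_samples, k, grid, score_fn):
    """Return the parameter combination with the highest mean score over k folds."""
    best_params, best_score = None, float("-inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        scores = [score_fn(params, tr, te) for tr, te in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Grid from the text: depths 3, 4, 5 and 16, 18, 20 trees (9 combinations).
GRID = {"max_depth": [3, 4, 5], "n_trees": [16, 18, 20]}


def toy_score(params, train_idx, test_idx):
    """Placeholder score; a real run would fit and evaluate a random forest here."""
    return 1.0 - 0.01 * abs(params["max_depth"] - 5) - 0.005 * abs(params["n_trees"] - 18)


best, best_mean = grid_search_cv(20, 4, GRID, toy_score)
```

In practice `toy_score` would be replaced by a function that trains on `train_idx` and returns the score on `test_idx`.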

The depth of the tree determines the maximum number of nodes per tree, which affects the level of detail that the model can learn. Deeper trees can represent more complex decision boundaries, but are also more prone to overfitting. Conversely, shallower trees have a higher bias, but their variance is smaller and their generalization ability is stronger. In this analysis, a depth of 5 provides the highest average score, indicating that this depth is sufficient to capture the features of the data while avoiding overfitting on the training data.

Figure 16 presents the results of the random forest model for duct classification, showing both the confusion matrix and the ROC (Receiver Operating Characteristic) curve with its AUC. The confusion matrix reports TN = 315, TP = 91, FP = 24, and FN = 15. Most samples are therefore classified correctly, with relatively few non-duct cases flagged as ducts (FP) and few actual ducts missed (FN). The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) as the decision threshold varies, and so shows how well the model separates duct from non-duct cases across thresholds. The AUC is 0.89, meaning that a randomly chosen duct sample receives a higher score than a randomly chosen non-duct sample about 89% of the time. An AUC of 0.5 corresponds to random guessing and a value of 1 to perfect separation, so this value indicates good discrimination between the two classes.
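The AUC can be computed without drawing the ROC curve, using its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. A minimal sketch with hypothetical labels and scores:

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical duct labels (1 = duct) and predicted duct probabilities.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This pairwise form is quadratic in the number of samples, which is acceptable for a test set of a few hundred days.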

In this study, the model shows a strong ability to distinguish the presence or absence of atmospheric ducts, as shown in Figure 17. The Kolmogorov–Smirnov (KS) statistic of the model reaches 0.829 at a threshold of 0.230, highlighting its strong discriminant ability. In addition, because non-ducting events greatly outnumber ducting events, the cumulative distribution function (CDF) curve of non-ducting events approaches 1 quickly, whereas the CDF curve of ducting events approaches 1 only at higher thresholds. These findings collectively emphasize the good overall prediction performance of the model. The results show that the model has great potential for providing reliable classification results in the practical detection of atmospheric duct events.
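A KS curve of this kind compares the empirical CDFs of the predicted scores for the two classes. The sketch below assumes that the KS statistic is the largest gap between those two CDFs and that the reported threshold is the score at which the gap occurs; the labels and scores are hypothetical.

```python
def ks_statistic(labels, scores):
    """Return (ks, threshold): the largest gap between the class-wise score CDFs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best_gap, best_threshold = 0.0, None
    for threshold in sorted(set(scores)):
        cdf_neg = sum(1 for s in neg if s <= threshold) / len(neg)
        cdf_pos = sum(1 for s in pos if s <= threshold) / len(pos)
        gap = abs(cdf_neg - cdf_pos)
        if gap > best_gap:
            best_gap, best_threshold = gap, threshold
    return best_gap, best_threshold


# Hypothetical duct labels (1 = duct) and predicted duct probabilities.
ks, threshold = ks_statistic([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9])
```

A KS value of 1 means the two score distributions do not overlap at all, and a value of 0 means they coincide.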

4. Discussion

In this study, Figure 8, Figure 9, Figure 10, Figure 11 and Figure 12 and Table 1, Table 2 and Table 3 present the performance of the GRU model across various atmospheric parameters. The model demonstrated the highest accuracy in temperature prediction, with the MSE of 0.51201, significantly lower than that of other variables. This exceptional performance is primarily due to the inherent stability and continuity of temperature data, which are largely influenced by long-term seasonal and climatic trends rather than sudden fluctuations. Such conditions are ideal for time series-based models like GRU, enabling them to effectively learn and predict temperature patterns. However, when we expanded the dataset to include data up to June 2024, our predictions for March 2024 exhibited a significant duct effect, leading to increased prediction errors (as illustrated in Figure 18). We calculated the R² scores for the model across the training set, validation set, and test set, which were 0.9152, 0.9177, and 0.4820, respectively. Both the validation and test sets consist of data not utilized in model training and are intended to evaluate the model’s quality. Notably, the R² scores for these two sets differ considerably, suggesting that the underlying patterns in the two segments of the dataset may vary.

In Figure 7, we illustrate the temperature trends at 1000 hPa over the past 20 years. The actual temperature for March 2024 shows considerable fluctuations, diverging significantly from typical patterns observed in previous Marches. While machine learning models can learn temperature change patterns from the training set, unrecognized patterns in the test set may impair the neural network’s performance. To improve the model, we propose three potential methods: first, incorporating techniques similar to EMD to enhance data analysis; second, utilizing ensemble learning to classify data by frequency and integrating predictions from multiple models; and finally, expanding the dataset by including similar feature data from other sources or leveraging transfer learning to improve accuracy amid variable changes.

For wind speed and direction, which are influenced by rapid changes and external factors, employing shorter time intervals in the time series is essential for better predictions. Reducing the current interval from 24 h to 12 or even 6 h could enhance the model’s ability to capture dynamic shifts. Additionally, methods such as wavelet transform or variational mode decomposition can be applied to decompose the data and extract more detailed features, further improving prediction accuracy.

Furthermore, our research includes a comparative analysis between GRU and LSTM models, with a specific focus on the 1000 hPa level. The three-layer GRU model outperforms others in predicting temperature, wind speed, and water vapor pressure, exhibiting the lowest errors. Additionally, the single-layer GRU shows greater precision in predicting height and wind direction. This highlights that the depth of the GRU layers is crucial for enhancing predictive performance, particularly under complex meteorological conditions. These findings align with existing literature, suggesting that multi-layer neural networks, such as the three-layer GRU, are more effective at capturing complex data dependencies, a critical feature in meteorological applications where multiple atmospheric variables dynamically interact.

We employed four-fold cross-validation to assess the impact of the number and depth of decision trees on the performance of our random forest model. Our findings indicate that increasing the number of trees enhances the model’s ability to capture complex data structures accurately and reduce overfitting, thus improving its stability and precision. With 18 trees, we achieved the optimal balance between cost-effectiveness and performance, demonstrating superior results under our current model configuration and dataset. Tree depth is crucial as well: deeper trees better replicate complex decision boundaries but also raise the risk of overfitting. Conversely, shallower trees might exhibit greater bias but have smaller variance, which improves their generalization ability. We found that a tree depth of five strikes an excellent balance between detailed representation and broad applicability.

Among the three random forest models constructed, Model 1 performs best. For instance, the confusion matrix and the ROC/AUC analysis highlight its ability to identify duct phenomena between 50 and 300 m. Its precision is 95% for non-duct events and 83% for duct events, with an overall accuracy of 92% (Table 4). In contrast, Model 2 and Model 3 detect the presence of ducts less accurately, which is largely attributable to the characteristics of the dataset. The input variables of the RF model are the data at the standard pressure levels (1000 hPa, 925 hPa, 850 hPa, 700 hPa), which correspond to heights of about 50 m, 700 m, 1500 m and 3000 m, respectively. Because of these large height intervals, it is difficult to accurately analyze the existence of ducts. The model therefore has limitations in higher-level duct detection, especially when the duct distribution is complex, because the spatial resolution of the existing data is too coarse to capture it accurately.

Looking forward, future research could benefit from employing ensemble learning techniques to further enhance the accuracy and robustness of GRU models. By aggregating outputs from multiple GRU models, it might be possible to reduce errors and bolster stability. Investigating multivariate predictions could provide deeper insights into the dynamic interactions among variables, enhancing forecasts for complex systems. For the RF models, forthcoming efforts could concentrate on optimizing parameters specifically for high-height duct prediction, experimenting with different feature engineering approaches, or developing more intricate model architectures to better capture the nuances of duct phenomena. Additionally, more meticulous parameter tuning and cross-validation could enhance the models’ generalization capabilities and accuracy across different height intervals.

5. Conclusions

In this study, we explore the combination of a time-series prediction neural network and a random forest model to predict the existence of ducts. The data come from Xisha Island in the South China Sea and cover 7000 days of valid IGRA sounding data from 2000 to June 2024. We used GRU and LSTM neural networks to construct prediction models with different numbers of layers, and found that the three-layer GRU model gave the best predictions. In the atmospheric parameter prediction experiments, the GRU model generally outperforms the LSTM model under the same conditions, and its training time is about 20% shorter. In addition, we use the random forest model to predict the existence of ducts in specific height regions; for surface ducts, the accuracy reaches 92%. However, in the region above 300 m, the prediction is relatively poor because of the low probability of duct occurrence and the weak correlation with the input data. This research confirms the effectiveness of combining a time-series prediction neural network with a random forest model, and provides a reference for future research in related fields.

Author Contributions

Conceptualization, Y.Y.; methodology, Y.Y.; formal analysis, Y.Y.; investigation, L.G. (Linjing Guo) and J.L.; resources, S.S., T.X. and H.Z.; project administration, L.G. (Linjing Guo), J.L., S.S. and T.X.; writing—original draft preparation, Y.Y. and Z.Y.; and writing—review and editing, Y.Y., L.G. (Linjing Guo) and L.G. (Lixin Guo). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Stable Support Research Grant of the China Institute of Radio Wave Propagation (No. A132312191), the National Natural Science Foundation of China (Grant Nos. U20B2059, 62071353 and 62071348) and the Taishan scholars Program.

Data Availability Statement

Sounding data: https://www.ncei.noaa.gov/products/weather-balloon/integrated-global-radiosonde-archive, accessed on 28 June 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Mesnard, F.; Sauvageot, H. Climatology of Anomalous Propagation Radar Echoes in a Coastal Area. J. Appl. Meteorol. Climatol. 2010, 49, 2285–2300.
Hao, X.-J.; Li, Q.-L.; Guo, L.-X.; Lin, L.-K.; Ding, Z.-H.; Zhao, Z.-W.; Yi, W. Digital Maps of Atmospheric Refractivity and Atmospheric Ducts Based on a Meteorological Observation Datasets. IEEE Trans. Antennas Propag. 2022, 70, 2873–2883.
Xia, Y.; Xie, F.; Lu, X. Enhancement of Arctic surface ozone during the 2020–2021 winter associated with the sudden stratospheric warming. Environ. Res. Lett. 2023, 18, 024003.
He, Y.; Zhu, X.; Sheng, Z.; He, M. Identification of stratospheric disturbance information in China based on the round-trip intelligent sounding system. Atmos. Chem. Phys. 2024, 24, 3839–3856.
Turton, J.D.; Bennetts, D.A.; Farmer, S.F.G. An introduction to radio ducting. Meteorol. Mag. 1988, 117, 245–254.
Shi, Y.; Wang, S.; Yang, F.; Yang, K. Statistical Analysis of Hybrid Atmospheric Ducts over the Northern South China Sea and Their Influence on Over-the-Horizon Electromagnetic Wave Propagation. J. Mar. Sci. Eng. 2023, 11, 669.
Yang, C.; Wang, J. The investigation of cooperation diversity for communication exploiting evaporation ducts in the South China sea. IEEE Trans. Antennas Propag. 2022, 70, 8337–8347.
Wang, S.; Yang, K.; Shi, Y.; Zhang, H.; Yang, F.; Hu, D.; Dong, G.; Shu, Y. Long-term over-the-horizon microwave channel measurements and statistical analysis in evaporation ducts over the Yellow Sea. Front. Mar. Sci. 2023, 10, 1077470.
Ma, J.; Wang, J.; Yang, C. Long-range microwave links guided by evaporation ducts. IEEE Commun. Mag. 2022, 60, 68–72.
Yang, N.; Su, D.; Wang, T. Atmospheric Ducts and Their Electromagnetic Propagation Characteristics in the Northwestern South China Sea. Remote Sens. 2023, 15, 3317.
Liu, Q.; Zhao, X.; Zou, J.; Hu, T.; Qiu, Z.; Wang, B.; Li, Z.; Cui, C.; Cao, R. Investigating the spatio–temporal characteristics of lower atmospheric ducts across the China seas by performing a long–term simulation using the WRF model. Front. Mar. Sci. 2024, 11, 1332805.
Gerstoft, P.; Rogers, L.T.; Krolik, J.L.; Hodgkiss, W.S. Inversion for refractivity parameters from radar sea clutter. Radio Sci. 2003, 38, 8053.
Douvenot, R.; Fabbro, V. On the knowledge of radar coverage at sea using real time refractivity from clutter. IET Radar Sonar Navig. 2010, 4, 293–301.
Yang, C.; Wang, Y.; Zhang, A.; Fan, H.; Guo, L. A Random Forest Algorithm Combined with Bayesian Optimization for Atmospheric Duct Estimation. Remote Sens. 2023, 15, 4296.
Jang, D.; Kim, J.; Park, Y.B.; Choo, H. Study of an Atmospheric Refractivity Estimation from a Clutter Using Genetic Algorithm. Appl. Sci. 2022, 12, 8566.
Wang, B.; Wu, Z.-S.; Zhao, Z.; Wang, H.-G. Retrieving evaporation duct heights from radar sea clutter using particle swarm optimization (PSO) algorithm. Prog. Electromagn. Res. M 2009, 9, 79–91.
Newbold, P. ARIMA model building and the time series analysis approach to forecasting. J. Forecast. 1983, 2, 23–35.
Ho, S.L.; Xie, M. The use of ARIMA models for reliability forecasting and analysis. Comput. Ind. Eng. 1998, 35, 213–216.
Kumar, U.; Jain, V.K. ARIMA forecasting of ambient air pollutants (O₃, NO, NO₂ and CO). Stoch. Environ. Res. Risk Assess. 2010, 24, 751–760.
Liu, H.; Shi, J.; Erdem, E. Prediction of wind speed time series using modified Taylor Kriging method. Energy 2010, 35, 4870–4879.
Tseng, F.-M.; Yu, H.-C.; Tzeng, G.-H. Combining neural network model with seasonal time series ARIMA model. Technol. Forecast. Soc. Change 2002, 69, 71–87.
Tsoi, A.C. Recurrent neural network architectures: An overview. In Adaptive Processing of Sequences and Data Structures: International Summer School on Neural Networks “E.R. Caianiello”, Vietri sul Mare, Salerno, Italy, September 6-13, 1997, Tutorial Lectures; Giles, C.L., Gori, M., Eds.; Springer: Berlin/Heidelberg, Germany, 1998; pp. 1–26. ISBN 978-3-540-69752-7.
Salehinejad, H.; Sankar, S.; Barfett, J.; Colak, E.; Valaee, S. Recent Advances in Recurrent Neural Networks. arXiv 2018, arXiv:1801.01078.
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Phys. Nonlinear Phenom. 2020, 404, 132306.
Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv 2014, arXiv:1412.3555.
De Ville, B. Decision trees. Wiley Interdiscip. Rev. Comput. Stat. 2013, 5, 448–455.
Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 2005, 26, 217–222.
Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
Alberoni, P.P.; Andersson, T.; Mezzasalma, P.; Michelson, D.B.; Nanni, S. Use of the vertical reflectivity profile for identification of anomalous propagation. Meteorol. Appl. 2001, 8, 257–266.
Bech, J.; Sairouni, A.; Codina, B.; Lorente, J.; Bebbington, D. Weather radar anaprop conditions at a Mediterranean coastal site. Phys. Chem. Earth Part B Hydrol. Ocean. Atmos. 2000, 25, 829–832.
Ferreira, A.P.; Nieto, R.; Gimeno, L. Completeness of radiosonde humidity observations based on the Integrated Global Radiosonde Archive. Earth Syst. Sci. Data 2019, 11, 603–627.
Jiménez, Á.B.; Lázaro, J.L.; Dorronsoro, J.R. Finding Optimal Model Parameters by Discrete Grid Search. In Innovations in Hybrid Intelligent Systems; Corchado, E., Corchado, J.M., Abraham, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 120–127. ISBN 978-3-540-74972-1.
Rodriguez, J.D.; Perez, A.; Lozano, J.A. Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 569–575.
Jung, Y. Multiple predicting K-fold cross-validation for model selection. J. Nonparametr. Stat. 2018, 30, 197–215.

Figure 1. The unit structure of RNN.

Figure 2. The unit structure of GRU.

Figure 4. Tropospheric electromagnetic wave refraction.

Figure 5. Comparison of vertical resolution in two data sets.

Figure 6. Deep learning process of atmospheric parameter estimation.

Figure 7. The comparison between the predicted value and the real value of the temperature at 1000 hPa in the whole dataset model.

Figure 8. Temperature prediction effect diagram. Where (a) is the temperature prediction diagram at 1000 hPa, (b) is the temperature prediction diagram at 925 hPa.

Figure 9. Height prediction effect diagram. Where (a) is the height prediction diagram at 1000 hPa, (b) is the height prediction diagram at 925 hPa.

Figure 10. Vapor Pressure prediction effect diagram. Where (a) is the vapor pressure prediction diagram at 1000 hPa, (b) is the vapor pressure prediction diagram at 925 hPa.

Figure 11. Wind Speed prediction effect diagram. Where (a) is the wind speed prediction diagram at 1000 hPa, (b) is the wind speed prediction diagram at 925 hPa.

Figure 12. Wind Direction prediction effect diagram. Where (a) is the wind direction prediction diagram at 1000 hPa, (b) is the wind direction prediction diagram at 925 hPa.

Figure 13. The temperature prediction residual at 1000 hPa. The red dashed line represents y = 0, indicating the level where the model predictions perfectly match the actual observations.

Figure 14. Residual graph effect: (a) The frequency histogram of the residuals at 1000 hPa illustrates the distribution of prediction errors. (b) The Q-Q plot compares the quantiles of the residuals to the quantiles of a normal distribution; the red line represents the theoretical normal distribution line.

Figure 15. Cross-validation analysis. In the figure, ‘3, 16’ denotes a forest of 16 decision trees with a depth of 3; the other labels follow the same pattern. The orange dashed box indicates where the model scores the highest and performs the best.

Figure 16. Random forest performance analysis diagram. (a) The Confusion Matrix of Random Forest; (b) The Area Under the Curve (AUC) of Random Forest.

Figure 17. KS Curve.

Figure 18. The prediction results of the shortened dataset for the temperature at 1000 hPa.

Table 1. The R² score of each variable in the training set, validation set and test set at 1000 hPa.

R²-Score	Temp	Height	Vapor Pressure	Wind Speed	Wind Direction
Training Set	0.9152	0.8623	0.8144	0.5913	0.6164
Validation Set	0.9052	0.8891	0.7871	0.5763	0.5613
Testing Set	0.9066	0.8704	0.7881	0.5422	0.5107

Table 2. The error of the predictive variables in the test set of the three-layer GRU model.

Pressure (hPa)	Temp (°C)	Height (m)	Vapor Pressure (hPa)	Wind Speed (m/s)	Wind Direction (°)
Metric	MSE	MSE	MSE	MSE	MAPE
1000	0.53869	161.15085	5.87945	5.10126	28.35029
925	0.95047	147.43068	5.17165	10.12768	32.91381
850	1.63513	130.11270	8.59465	9.77982	44.31677
700	1.62421	126.85946	5.10126	8.17118	56.19095

Table 3. The variable prediction error of GRU model and LSTM model at 1000 hPa.

Layers	Single	Single	Double	Double	Treble	Treble
Model	GRU	LSTM	GRU	LSTM	GRU	LSTM
Temperature/°C	0.53094	0.54151	0.54204	0.59616	0.51201	0.64376
Vapor Pressure/hPa	6.56092	7.32616	6.24601	7.21120	6.18203	7.22746
Height/m	166.82834	189.10083	167.64525	209.29814	171.20314	223.69974
Wind Speed/m/s	6.15613	6.46087	6.32554	6.58230	6.12651	6.48521
Wind Direction/°	29.44751	31.05189	30.62275	33.78993	32.11548	30.13900
Time/s	6.58	7.51	11.80	15.02	16.65	21.64

Table 4. The prediction accuracy of three RF models.

Model	Situation	Precision	Recall	F1-Score	Support
50–300 m	Duct Events	0.83	0.84	0.84	114
	Without Duct	0.95	0.94	0.94	331
	Accuracy	0.92			445
300–800 m	Duct Events	0.34	0.75	0.47	55
	Without Duct	0.96	0.80	0.87	390
	Accuracy	0.79			445
800–1500 m	Duct Events	0.42	0.75	0.54	76
	Without Duct	0.94	0.78	0.85	369
	Accuracy	0.78			445

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

