Article

An Integrated Stacking Ensemble Model for Natural Gas Purchase Prediction Incorporating Multiple Features

1
School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
2
Chengdu Pidu District Xingneng Natural Gas Co., Ltd., Pidu District, Chengdu 611730, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(2), 778; https://doi.org/10.3390/app15020778
Submission received: 18 December 2024 / Revised: 5 January 2025 / Accepted: 9 January 2025 / Published: 14 January 2025
(This article belongs to the Special Issue Artificial Intelligence Applications in Industry)

Abstract

Accurate prediction of natural gas purchase volumes is crucial for both the economy and the environment. It not only facilitates the rational allocation of resources for companies but also helps to reduce operational costs. Although existing prediction methods have achieved some success in addressing the nonlinear relationships in natural gas purchases, there remains potential for further improvement. To address this issue, a stacking ensemble learning model was developed to enhance the ability to handle complex nonlinear problems. This model integrates diverse algorithms and incorporates weather factors, while regionalizing characteristics of natural gas usage, thereby achieving accurate forecasts of natural gas purchase volumes. We selected three distinctly different backbone models for our research: Informer, multiple linear regression (MLR), and support vector regression (SVR). By conducting four different feature combination experiments for each backbone model, including weather, time, regional, and usage features, we constructed 12 base models. Subsequently, we integrated these base models using a meta-learner to form the final stacking ensemble model. The experimental results indicate that the stacking ensemble model outperforms individual models across key metrics, including $R^2$, MRE, and RMSE. Notably, the $R^2$ values improved by 4–15% compared to the 12 base models. The model was subsequently applied to predict natural gas purchase volumes in Pi County, Chengdu, China. In November 2024, a side-by-side comparison of the predicted and actual data revealed a maximum error of just 5.39%. This accuracy effectively meets forecasting requirements, underscoring the model’s predictive strength in the energy sector.

1. Introduction

Natural gas, a clean and efficient energy source, plays a significant role in the global energy consumption structure. Widely utilized in power generation, heating, and industrial applications, it is a cornerstone of modern energy systems. According to the World Energy Statistics Yearbook 2023, natural gas consumption has increased significantly in North America, with notable demand growth in the industrial and power sectors of the United States and Canada. In China, despite a slight decline in 2022, total consumption still reached 358 billion cubic meters, highlighting its essential role in supporting energy demand and economic growth. Accurate forecasting of gas purchases is vital for effective gas supply chain management, enabling suppliers to plan procurement schedules effectively and maintain efficient operations to meet growing demand. In Europe, concerns over supply disruptions and record-high prices have led to a notable decrease in natural gas consumption, underscoring the importance of precise forecasting to balance supply and demand, avoid shortages, and minimize costs. High-accuracy forecasts of gas purchases not only help companies stockpile gas in anticipation of peak demand but also optimize procurement plans, minimize economic losses, and enhance customer satisfaction [1]. As natural gas continues to grow in importance within the global energy landscape, and as regional and economic challenges intensify, the need for innovative solutions to achieve accurate purchase forecasts has become more urgent than ever.
In the field of natural gas demand forecasting, researchers have employed various methods to improve prediction accuracy and reliability. Econometric models use statistical techniques to establish relationships between natural gas demand and influencing factors, such as GDP or prices [2]. However, such methods require a substantial amount of data support and often struggle to handle nonlinear relationships. Time series models forecast future demand by analyzing historical data trends. Common methods include the statistical decomposition model, autoregressive integrated moving average (ARIMA), and seasonal autoregressive integrated moving average (SARIMA), which perform well in short-term forecasting [3,4]. Artificial neural networks (ANN) train nonlinear prediction models using historical data, with common approaches including two-stage systems and multilayer perceptron (MLP). These methods excel at capturing complex nonlinear relationships [5,6]. However, they require extensive experimentation to identify a suitable network structure, and intricate training is necessary to achieve high prediction accuracy [7]. Ensemble learning models, such as methods combining ANN, ANFIS, and traditional regression models, leverage the advantages of multiple models [8]. Currently, two ensemble learning methods, boosting and bagging, are primarily employed in natural gas demand forecasting. The stacking method, another ensemble learning approach, has the capability to mine and express highly complex nonlinear relationships. However, it has not been widely applied in natural gas demand forecasting.
To address the issues identified in existing natural gas purchase prediction models, this research introduces a new stacking ensemble prediction method for natural gas purchases that integrates multiple features. The characteristics of this model fall into three aspects: (1) Feature enhancement. For regional features, natural gas usage patterns are first categorized based on geographical characteristics. Procurement scenarios for different industries, such as commercial, industrial, and residential gas consumption, are then further delineated, providing a more detailed understanding of demand. Additionally, weather and temporal features are incorporated as basic features to account for seasonal variations and medium-term demand fluctuations [9]. These enriched features enhance the model’s ability to capture complex nonlinear relationships, which are critical for accurately predicting natural gas demand. (2) Model selection. To enhance predictive accuracy from various perspectives, three prediction models with distinct characteristics were selected: Informer, MLR, and SVR. Each model has its own strengths: Informer excels at processing long-term time series data, while MLR and SVR effectively model linear and nonlinear relationships, respectively. By combining these models with different feature sets, 12 base models were constructed, enabling a comprehensive exploration of feature contributions. (3) Ensemble modeling. A stacking ensemble method was developed using the 12 base models as base learners. This method aggregates predictions from multiple models, capturing hidden feature interactions and improving the overall performance. By leveraging the strengths of different models and features, the stacking ensemble method addresses complex nonlinear relationships and reduces prediction errors.
An analysis of actual data from a specific region in Chengdu, China, was conducted to compare the prediction results of individual models with those of the stacked ensemble models. The stacked ensemble method demonstrated superior performance and strong generalization capabilities. The main contributions are as follows:
  • Pioneering application of stacking ensemble: This study marks the first application of the stacking ensemble method to natural gas procurement forecasting. By integrating the strengths of 12 base models derived from Informer, MLR, and SVR backbone models, this approach significantly improved the $R^2$ value by at least four percentage points.
  • Fine-grained feature integration: Segmenting natural gas demand by regional and usage characteristics enhanced the $R^2$ value of the three backbone models by approximately 1–15 percentage points, providing more accurate inputs for the stacking ensemble model.
  • Enhanced seasonality modeling: Adding weather factors (e.g., maximum and minimum temperatures) significantly improved the model’s response to seasonal changes and medium-term demand fluctuations, resulting in an $R^2$ improvement of approximately 5–10 percentage points.
These innovations not only improve prediction accuracy but also hold significant implications for optimizing natural gas procurement processes and minimizing operational costs.
The rest of this study is organized as follows: Section 2 reviews related literature. Section 3 describes the datasets, as well as the construction of the backbone models and the stacking ensemble model. Section 4 presents and discusses the experimental results and conducts future trend predictions. Section 5 concludes the research findings and outlines directions for future research.

2. Related Work

2.1. Models Related to Natural Gas Prediction

In the field of natural gas demand forecasting, researchers have adopted various methods to improve prediction accuracy and reliability. These methods include models that establish relationships between natural gas demand and influencing factors through statistical and econometric methods, as well as time series models that predict future demand by analyzing historical data trends. Statistical and econometric methods can effectively explain the driving factors behind demand changes. For example, Shahbaz et al. [2], in their study of Pakistan, found that the elasticity of GDP on natural gas demand is significant, while the price elasticity is relatively small. Similarly, Ahmad et al. [10], in their study of Pakistan, found that per capita natural gas demand is positively correlated with per capita real income, the average real price of natural gas, and the real price of substitutes. Time series models, on the other hand, primarily rely on the continuity of historical data to predict future demand trends. For instance, Sanchez-Ubeda et al. [11] proposed a new statistical decomposition model that can capture the demand patterns of industrial end-users’ natural gas consumption. Nasr et al. [12] used the ARIMA to predict Turkey’s natural gas demand and found that this model performs well in short-term forecasting.
Artificial neural networks (ANNs) train nonlinear prediction models using historical data, capable of capturing complex nonlinear relationships. For example, Khotanzad et al. [6] used a two-stage system in which the first stage consisted of three different ANN predictors, while the second stage utilized a nonlinear combination predictor to forecast natural gas demand. Szoplik et al. [8] used an MLP-trained ANN to predict natural gas consumption for residents in Szczecin, Poland, demonstrating its advantages in capturing complex nonlinear relationships. To address the limitations of individual models, Azadeh et al. [7] proposed a method combining ANN, ANFIS, and traditional regression models for long-term natural gas demand forecasting. Khan et al. [13] further improved the accuracy of natural gas demand forecasting in Pakistan by using a combination of multiple models. These hybrid methods leverage the strengths of different models, thereby enhancing the accuracy and reliability of predictions.
Despite the widespread use of these methods in natural gas demand forecasting, each has its own limitations. Some methods struggle to handle nonlinear relationships adequately, while others fail to effectively capture changes in market and economic conditions. Additionally, certain methods are complex to implement and require large amounts of data. Overall, these methods often find it challenging to address the complexities of nonlinear relationships and multiple features simultaneously, making it difficult to achieve sufficiently accurate and robust predictions in complex market environments.

2.2. Ensemble Methods

Ensemble methods improve the overall performance of models by combining the predictions of multiple base learners. These methods have been widely applied and extensively studied in the field of natural gas demand forecasting [14]. The primary ensemble methods include boosting, bagging, and stacking.
Boosting combines multiple weak learners to form a strong learner by continuously adjusting the weights of the samples, ensuring that subsequent weak learners focus more on the samples misclassified by previous weak learners [15,16]. Svoboda et al. [17] applied the boosting method (AdaBoost) to enhance the accuracy of short-term natural gas consumption forecasting, demonstrating outstanding performance, particularly with complex short-term consumption data. Bagging, in contrast, is an ensemble method that trains multiple base learners using several resampled training sets [18,19]. Meira et al. [20] implemented bagging with random forest to reduce the variance of the natural gas demand forecasting model and improve its accuracy, outperforming individual models.
Stacking has demonstrated superior integration capabilities and predictive performance in various fields, such as electricity load forecasting [21]. However, an extensive literature search reveals that no research has yet applied the stacking ensemble to natural gas demand forecasting.
In natural gas forecasting, the integration strategies of learners in ensemble learning mainly include simple averaging, weighted averaging, and voting methods (such as majority voting and weighted voting) [22,23]. Papageorgiou et al. [24] employed combination methods to aggregate the predictions of base models, including time series models, regression models, and neural networks, to enhance overall natural gas consumption forecasting performance, consistently outperforming individual models.
It is generally believed that bagging primarily improves performance by integrating predictive models obtained from different perspectives. Boosting, in contrast, focuses on integrating a series of models by identifying and correcting failed predictions to achieve better prediction accuracy. Stacking, a third approach, derives the final result through secondary learning of the base learner’s outputs, making it particularly effective for problems where nonlinearity is challenging to express.

2.3. Feature Engineering

In existing research on natural gas demand forecasting, feature engineering has been widely applied to various prediction tasks, such as energy demand forecasting and economic indicator prediction. Among these tasks, the selection of features is particularly crucial. By choosing the features that best reflect the actual situation, it is possible to more accurately capture patterns and trends in the data.
Yang et al. [25] introduced regional features and proposed an enhanced compound neural network (AComNN) to capture the time-zone-dependent context in cross-regional financial time series, achieving a 13.36% improvement in prediction accuracy compared to the baseline model. Cheng et al. [26] incorporated user consumption patterns and proposed a day-ahead probabilistic residential load forecasting method based on CNN-SE and micrometeorological data, significantly improving the feasibility and accuracy of predictions. The impact of weather features on predictions has also been demonstrated in the literature. For instance, Wang et al. [27] improved solar power forecasting performance by reconstructing machine learning inputs through PV analytical modeling to explore key weather factors.
However, in the field of natural gas demand forecasting, commonly selected features include per capita GDP, population growth rate, and natural gas prices [13,28,29]. The use of weather features, and particularly of regional and usage features, has rarely been studied.

3. Materials and Methods

This section provides detailed information about the data and methods used in this study.

3.1. Data

The data for this study were obtained from a natural gas limited liability company in a district of Chengdu, covering the period from January 2016 to October 2023. These data include monthly records of natural gas purchases and sales. Table 1 presents the detailed statistics of these datasets.
For the weather data, Python’s requests library was used to send HTTP requests to a weather website and scrape the daily temperature data for a district in Chengdu from 1 January 2016 to 30 October 2023. This dataset included the daily maximum and minimum temperatures. The daily data were then grouped by month, and the average maximum and minimum temperatures for each month were calculated. Additionally, dates were converted into time features such as the year and month to capture the seasonal and cyclical trends in the time series.
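The monthly aggregation step can be illustrated with a minimal sketch. The scraping itself is omitted because the weather website is not named; the function name `monthly_mean_temperatures`, the record layout, and the sample values are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date


def monthly_mean_temperatures(daily_records):
    """Average daily max/min temperatures per (year, month).

    daily_records: iterable of (date, t_max, t_min) tuples.
    Returns {(year, month): (mean_t_max, mean_t_min)}.
    """
    buckets = defaultdict(list)
    for day, t_max, t_min in daily_records:
        # The (year, month) key doubles as the time features used later.
        buckets[(day.year, day.month)].append((t_max, t_min))
    return {
        key: (
            sum(v[0] for v in values) / len(values),
            sum(v[1] for v in values) / len(values),
        )
        for key, values in sorted(buckets.items())
    }


# Hypothetical sample values, not data from the study.
sample = [
    (date(2016, 1, 1), 10.0, 2.0),
    (date(2016, 1, 2), 12.0, 4.0),
    (date(2016, 2, 1), 15.0, 7.0),
]
monthly = monthly_mean_temperatures(sample)
```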
After data cleaning and transformation, the data visualization is shown in Figure 1. The chart illustrates the natural gas sales volumes across various usage types and regions from 2016 to 2023.
Training and testing set division: To ensure the model’s generalization ability, the dataset was divided into training and testing sets according to the time series. A total of 70% of the data were used for the training set, and 30% were used for the testing set. Specifically, the data from 2016 to 2020 were used for the training set, while the data from 2021 to 2023 were used for the testing set.
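A chronological split of this kind might be sketched as follows. The record layout and the helper name `split_by_year` are hypothetical; only the year boundary (training through 2020, testing from 2021) is taken from the text.

```python
def split_by_year(records, last_train_year=2020):
    """Split monthly records chronologically, without shuffling."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test


# Monthly index from January 2016 to October 2023 (values omitted).
records = [
    {"year": y, "month": m}
    for y in range(2016, 2024)
    for m in range(1, 13)
    if (y, m) <= (2023, 10)
]
train, test = split_by_year(records)
```

Splitting by date rather than at random keeps every test observation later than every training observation, which avoids leaking future information into training.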
Prediction horizon and input window: The past 24 months of historical data were utilized to forecast natural gas purchase volumes for the subsequent 3 months. This 3-month-ahead forecasting approach was implemented in a rolling manner, with each prediction step incorporating updated actual values from the previous step as input. This configuration enabled the model to capture both seasonal patterns and medium-term trends, ensuring robust performance across different time periods.
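One way to picture this configuration is as a sliding window over the monthly series. The sketch below assumes a stride of one month between consecutive windows, which the text does not state; `make_windows` and the placeholder series are illustrative.

```python
def make_windows(series, input_len=24, horizon=3):
    """Return (past, future) pairs: 24 past months -> next 3 months."""
    pairs = []
    for start in range(len(series) - input_len - horizon + 1):
        past = series[start:start + input_len]
        future = series[start + input_len:start + input_len + horizon]
        pairs.append((past, future))
    return pairs


# Placeholder values standing in for 94 monthly observations.
series = list(range(94))
pairs = make_windows(series)
```

In a rolling forecast, each new window would be filled with the actual values observed since the previous prediction step.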
A correlation analysis was then performed to identify key features that significantly affect the target variable and to uncover high correlations among features, thereby avoiding redundancy in the model [30]. This analysis is illustrated in Figure 2 and Figure 3.
Figure 2 presents the correlation matrix between regional features (natural gas sales volumes in each region), basic features (year, month, minimum temperature, and maximum temperature), and gas purchase volume. The analysis highlights the basis for feature selection and the relationships among regions. Among the basic features, the year shows a moderate positive correlation with gas purchase volume (r = 0.56), indicating an overall increase in gas demand over time. Minimum and maximum temperatures show strong negative correlations (r = −0.67 and r = −0.69, respectively), reflecting higher gas demand during colder periods for heating. For regional features, sales volumes in Hongguang, Pixian, and Xipu exhibit high positive correlations with gas purchase volume (r = 0.73, r = 0.78, and r = 0.73, respectively), indicating their predictive significance. In contrast, Anjing and Qiaosong have lower correlations (r = 0 and r = 0.43, respectively), suggesting limited contributions. Based on these findings, the selected features include year, month, minimum temperature, maximum temperature, Hongguang, Pixian, and Xipu to enhance predictive performance while minimizing redundancy.
Figure 3 presents the correlation matrix between usage type features (total sales volume, residential, commercial, schools, etc.), basic features, and gas purchase volume. Among the usage type features, total sales volume exhibits the highest correlation with gas purchase volume (r = 0.97), making it the most significant predictor. Although Residential also shows a high correlation (r = 0.82), its strong correlation with total sales volume (r = 0.97) indicates redundancy. Therefore, total sales volume was selected as the preferred feature over Residential.
For other features, Industrial (r = 0.68) and Collective (r = 0.66) were chosen due to their relatively strong correlations and distinct patterns. Features like Commercial (r = 0.53) and Schools (r = 0.38) were excluded due to their weaker correlations. Consequently, the selected usage type features are total sales volume, Industrial, and Collective, capturing the primary drivers of natural gas demand while minimizing redundancy.
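The correlation-based screening can be sketched as below. The study does not report a numeric cut-off, and its final choice also weighed redundancy between features, so the `threshold` parameter, the function names, and the toy columns are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.6):
    """Keep features whose |r| with the target meets a hypothetical threshold."""
    scores = {name: pearson_r(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Toy columns, not data from the study.
target = [1, 2, 3, 4, 5]
columns = {
    "strong": [2, 4, 6, 8, 10],
    "inverse": [5, 4, 3, 2, 1],
    "weak": [1, -1, 0, -1, 1],
}
kept, scores = select_features(columns, target)
```

The absolute value matters here: a strongly negative correlation, such as that between temperature and gas purchases, is as informative as a strongly positive one.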

3.2. Backbone Models and Construction of Basic Natural Gas Purchase Forecasting Models

In this section, we selected Informer, MLR, and SVR as our backbone models and, on each of them, constructed four base models, each using a distinct combination of features. The Informer was chosen for its efficiency and accuracy in handling long-term time series data. The MLR model was chosen for its simplicity and interpretability, making it well suited to modeling linear relationships. The SVR model was chosen for its capability to handle nonlinear relationships and its robust generalization abilities. By integrating these three fundamentally different backbone models, our aim is to capture the dynamics of natural gas purchase volumes from a more comprehensive perspective.

3.2.1. Informer and Its Four Base Model Constructions

The Informer aims to address the efficiency and accuracy issues of the traditional Transformer in long-sequence time series forecasting. Its core innovation is a sparse self-attention mechanism that selectively focuses on the important parts of the input sequence, which reduces the computational overhead relative to the full attention mechanism of the standard Transformer while maintaining high performance [31]. This is achieved through two main strategies [32]:
1. ProbSparse attention: By focusing on the most critical parts of the sequence, the quadratic complexity $O(L^2)$ of traditional self-attention is reduced to $O(L \log L)$. This is achieved by selecting the top-$u$ queries based on their impact, significantly speeding up the processing.
2. Distilling operation: The distilling operation halves the sequence length at each layer while retaining key information, thereby decreasing the dimensionality of the sequence.
This hierarchical structure enables the model to efficiently handle long sequences.
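The two strategies can be illustrated with a simplified sketch. It scores each query by the gap between its maximum and mean scaled dot product with the keys and keeps the top $u \approx c \ln L$ queries; unlike the full Informer, it uses all keys rather than a sampled subset, and the constant `c`, the function names, and the toy vectors are assumptions.

```python
import math


def sparsity_scores(queries, keys):
    """Max-minus-mean of scaled dot products, one score per query."""
    d = len(queries[0])
    scores = []
    for q in queries:
        dots = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        scores.append(max(dots) - sum(dots) / len(dots))
    return scores


def top_u_queries(queries, keys, c=1.0):
    """Indices of the u = ceil(c * ln L) most 'active' queries."""
    u = max(1, min(len(queries), math.ceil(c * math.log(len(queries)))))
    scores = sparsity_scores(queries, keys)
    ranked = sorted(range(len(queries)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:u])


def distilled_lengths(seq_len, n_layers):
    """Sequence length after each distilling layer (roughly halved each time)."""
    lengths = [seq_len]
    for _ in range(n_layers):
        lengths.append((lengths[-1] + 1) // 2)
    return lengths


# Toy vectors, chosen so that queries 0 and 2 dominate.
queries = [[1, 0], [0, 0], [0, 1], [0, 0]]
keys = [[4, 0], [0, 2], [0, 0], [0, 0]]
selected = top_u_queries(queries, keys)
lengths = distilled_lengths(24, 3)
```

Only the selected queries receive full attention, which is where the saving over the quadratic cost comes from. The exact rounding in the distilling step depends on the padding used in practice.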
In this study, four base models were constructed using the Informer to capture the impact of different feature combinations on the prediction of natural gas purchase volumes. The specific models are as follows:
1. INF: This model utilizes only basic features for prediction. The model is formulated as follows:
$$\hat{Y}_{\text{INF}} = \mathrm{Informer}(X_{basic})$$
where $X_{basic}$ represents the matrix of basic features, which includes time features (year, month) and temperature features (monthly average minimum and maximum temperatures). This can be expressed by the following formula:
$$X_{basic} = X_{time} + X_{temperature}$$
2. INF-RB: Based on the basic features, regional features (natural gas sales volumes in different regions) are added. The model is formulated as follows:
$$\hat{Y}_{\text{INF-RB}} = \mathrm{Informer}(X_{basic} + X_{region})$$
where $X_{region}$ represents the matrix of regional features.
3. INF-UB: Based on the basic features, usage features (such as commercial, industrial, residential, etc.) are added. The model is formulated as follows:
$$\hat{Y}_{\text{INF-UB}} = \mathrm{Informer}(X_{basic} + X_{usage})$$
where $X_{usage}$ represents the matrix of usage features.
4. INF-RB-UB: Combining basic features, regional features, and usage features, this model comprehensively considers all features for prediction. The model is formulated as follows:
$$\hat{Y}_{\text{INF-RB-UB}} = \mathrm{Informer}(X_{basic} + X_{region} + X_{usage})$$
The same feature combinations are applied to the MLR and SVR.
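Reading the "+" in these formulas as column-wise concatenation of feature matrices, the four feature sets and the resulting twelve model names can be sketched as follows. The column names are placeholders based on the features selected in Section 3.1, and `build_matrix` is a hypothetical helper.

```python
# Placeholder column names; the real dataset's column labels are not given.
BASIC = ["year", "month", "temp_min", "temp_max"]
REGION = ["hongguang", "pixian", "xipu"]
USAGE = ["total_sales", "industrial", "collective"]

# Suffix -> columns used by that variant ("+" means concatenating columns).
FEATURE_SETS = {
    "": BASIC,
    "-RB": BASIC + REGION,
    "-UB": BASIC + USAGE,
    "-RB-UB": BASIC + REGION + USAGE,
}


def base_model_names(backbones=("INF", "MLR", "SVR")):
    """Names of the base models: one per backbone and feature set."""
    return [backbone + suffix for backbone in backbones for suffix in FEATURE_SETS]


def build_matrix(rows, columns):
    """Select the given columns from a list of dict records."""
    return [[row[c] for c in columns] for row in rows]


names = base_model_names()
```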
Table 2 shows the parameters used for training and predicting with the Informer model.
The Informer was trained using the Adam optimizer with a learning rate of 0.0001 and a batch size of 4. The training process spanned 5 epochs, with early stopping criteria based on validation loss.

3.2.2. MLR and Its Four Base Model Constructions

Multiple linear regression (MLR) is a classic regression analysis method that establishes a linear relationship between independent variables (features) and a dependent variable (target value) for making predictions. The advantage of MLR is its interpretability, as it clearly demonstrates the impact of each feature on the target value [33]. The underlying principle is to identify a set of regression coefficients that minimize the residual sum of squares between the predicted and actual values. The basic formula is as follows:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \omega$$
where $Y$ denotes the dependent variable, $x_1, x_2, \ldots, x_n$ denote the independent variables, $\beta_0$ denotes the intercept term, $\beta_1, \beta_2, \ldots, \beta_n$ denote the regression coefficients, and $\omega$ denotes the error term.
Similarly, four base models were constructed for MLR to evaluate the impacts of different feature combinations. The specifics are as follows:
1. MLR:
$$\hat{Y}_{\text{MLR}} = \beta_0 + \beta_{basic} X_{basic}$$
where $\beta_{basic}$ denotes the regression coefficients corresponding to the basic features, and $|\beta_{basic}| = 4$.
2. MLR-RB:
$$\hat{Y}_{\text{MLR-RB}} = \beta_0 + \beta_{basic} X_{basic} + \beta_{region} X_{region}$$
where $\beta_{region}$ denotes the regression coefficients corresponding to the regional features, and $|\beta_{region}| = 5$.
3. MLR-UB:
$$\hat{Y}_{\text{MLR-UB}} = \beta_0 + \beta_{basic} X_{basic} + \beta_{usage} X_{usage}$$
where $\beta_{usage}$ denotes the regression coefficients corresponding to the usage features, and $|\beta_{usage}| = 6$.
4. MLR-RB-UB:
$$\hat{Y}_{\text{MLR-RB-UB}} = \beta_0 + \beta_{basic} X_{basic} + \beta_{region} X_{region} + \beta_{usage} X_{usage}$$
The LinearRegression implementation from the sklearn library in Python was used to construct the multiple linear regression (MLR) models. The parameters for the model are shown in Table 3.
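The least-squares principle behind MLR can be illustrated with a dependency-free sketch that solves the normal equations directly. This is not the sklearn implementation used in the study; `fit_mlr`, `predict_mlr`, and the toy data are assumptions, and in practice a library routine would replace the hand-written solver.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a square system a x = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mlr(X, y):
    """Return [beta_0, beta_1, ..., beta_n] minimizing the residual sum of squares."""
    design = [[1.0] + list(row) for row in X]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, X):
    """Apply the fitted coefficients to new rows."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in X]


# Toy data generated from y = 1 + 2*x1 + 3*x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
beta = fit_mlr(X, y)
```

Because the toy targets are an exact linear function of the inputs, the fitted coefficients recover the intercept and both slopes.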

3.2.3. SVR and Its Four Base Model Constructions

SVR is a regression method based on the support vector machine (SVM) principle, aiming to find a function that minimizes the deviation within a certain interval for the training dataset while also minimizing model complexity [34]. SVR introduces the $\epsilon$-insensitive loss function, which only considers the training samples whose deviations exceed $\epsilon$, thereby constructing a robust regression model [35].
The goal of the SVR is to find a function $f(x)$ that keeps the deviation within $\epsilon$ while ensuring smoothness. Specifically, SVR solves the following optimization problem:
$$\min_{w, b} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
subject to
$$y_i - (w \cdot x_i + b) \le \epsilon + \xi_i, \qquad (w \cdot x_i + b) - y_i \le \epsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0$$
where $w$ denotes the weight vector, $b$ denotes the bias term, $C$ denotes the regularization parameter, and $\xi_i$ and $\xi_i^*$ denote slack variables. $\epsilon$ denotes the tolerance of the $\epsilon$-insensitive loss function.
The SVR can be transformed into the following form through the Lagrange multipliers $\alpha_i$ and $\alpha_i^*$:
$$\hat{Y}_{\text{SVR}} = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(x_i, x) + b$$
where $K(x_i, x)$ denotes the kernel function, $n$ represents the number of support vectors, and $b$ denotes the bias term of the model.
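The $\epsilon$-insensitive loss and the kernel form of the prediction can be made concrete with a short sketch. The RBF kernel and its `gamma` value are assumptions (the kernel actually used is listed in Table 4), and the support vectors and coefficients below are invented for illustration rather than obtained by training.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess deviation outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def rbf_kernel(u, v, gamma=0.5):
    """Radial basis function kernel (an assumed choice of K)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, bias, kernel=rbf_kernel):
    """Evaluate sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] stands for the difference (alpha_i - alpha_i*).
    """
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias


# Invented support vectors and coefficients, for illustration only.
prediction = svr_predict([0.0], [[0.0], [1.0]], [2.0, -1.0], bias=0.5)
```

Deviations smaller than $\epsilon$ contribute nothing to the loss, which is why only samples on or outside the tube end up as support vectors.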
For SVR, four base models were developed to assess the impacts of various feature combinations, as detailed below:
1. SVR:
$$\hat{Y}_{\text{SVR}} = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(X_{basic}, x) + b$$
2. SVR-RB:
$$\hat{Y}_{\text{SVR-RB}} = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(X_{basic} + X_{region}, x) + b$$
3. SVR-UB:
$$\hat{Y}_{\text{SVR-UB}} = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(X_{basic} + X_{usage}, x) + b$$
4. SVR-RB-UB:
$$\hat{Y}_{\text{SVR-RB-UB}} = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(X_{basic} + X_{region} + X_{usage}, x) + b$$
The SVR implementation from the sklearn library in Python was used to construct the support vector regression models. The parameters for the model are shown in Table 4.

3.3. Stacking Ensemble

Stacking ensemble involves constructing models using different learning algorithms and then training a combiner algorithm to make final predictions based on the outputs of the base algorithms. This combiner can be any ensemble technique [36].
In this study, a stacking ensemble approach was implemented, incorporating three backbone models (Informer, MLR, and SVR) with various feature combinations, resulting in twelve base models (INF, INF-RB, INF-UB, INF-RB-UB, MLR, MLR-RB, MLR-UB, MLR-RB-UB, SVR, SVR-RB, SVR-UB, SVR-RB-UB), and one top-level model (MLR).
MLR was selected as the meta-learner for our stacking ensemble approach because of its computational efficiency, ease of interpretation, and its role in mitigating overfitting risks. In contrast to more intricate models, MLR offers transparent regression coefficients, which explicitly reveal the influence of each base model on the ultimate forecast. This clarity enhances our understanding of the prediction process. Furthermore, the straightforward nature of MLR is particularly beneficial in scenarios with smaller sample sizes, as it helps to maintain the model’s stability and its ability to generalize, thereby ensuring robust and reliable predictions from our stacked model. The architecture of the scheme used is depicted in Figure 4.
The stacking ensemble model consists of two layers: a base learner layer and a meta-learner layer. The working principles of each layer are outlined using mathematical formulas.
Base learner layer: Let the training dataset be represented as $(X, Y)$, where the input $X$ is divided into the feature matrices $X_{\mathrm{basic}}$, $X_{\mathrm{region}}$, and $X_{\mathrm{usage}}$, and $Y$ denotes the output target. Twelve different base learners were selected for training. Each base learner $f_m(X_i)$ was trained on the training set to produce predictions $\hat{Y}_m$:
$$\hat{Y}_m = f_m(X_i), \quad m \in \{\mathrm{INF}, \mathrm{INF\text{-}RB}, \ldots, \mathrm{SVR\text{-}RB\text{-}UB}\}$$
where $X_i \in \{X_{\mathrm{basic}},\ X_{\mathrm{basic}} + X_{\mathrm{region}},\ X_{\mathrm{basic}} + X_{\mathrm{usage}},\ X_{\mathrm{basic}} + X_{\mathrm{region}} + X_{\mathrm{usage}}\}$.
These predictions from the base learners are combined to form a new feature matrix $Z$, where each column represents the prediction of one base learner:
$$Z = [\hat{Y}_{\mathrm{INF}}, \hat{Y}_{\mathrm{INF\text{-}RB}}, \ldots, \hat{Y}_{\mathrm{SVR\text{-}RB\text{-}UB}}]$$
Meta-learner layer: The meta-learner uses the prediction results $Z$ from the base learners as input and the true target $Y$ for training. The meta-learner $g(Z)$ is trained to learn how to best combine the outputs of the base learners to generate the final prediction $\hat{Y}$:
$$\hat{Y} = g(Z) = g([\hat{Y}_{\mathrm{INF}}, \hat{Y}_{\mathrm{INF\text{-}RB}}, \ldots, \hat{Y}_{\mathrm{SVR\text{-}RB\text{-}UB}}])$$
Specifically, the meta-learner is a multiple linear regression model:
$$\hat{Y} = \beta_0 + \beta_1 \hat{Y}_{\mathrm{INF}} + \beta_2 \hat{Y}_{\mathrm{INF\text{-}RB}} + \cdots + \beta_{12} \hat{Y}_{\mathrm{SVR\text{-}RB\text{-}UB}}$$
where $\beta_0$ denotes the intercept, and $\beta_1, \beta_2, \ldots, \beta_{12}$ are the regression coefficients, determined by minimizing the least squares error function.
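For reference, the least-squares criterion for the meta-learner coefficients can be written out explicitly. This is the standard ordinary least-squares formulation, given here as a sketch; the sample index $j$, the per-sample predictions $\hat{y}_{k,j}$ of the $k$-th base learner, the training-set size $n$, and the augmented matrix $\tilde{Z} = [\mathbf{1}, Z]$ are notation introduced for this illustration.

$$\hat{\beta} = \arg\min_{\beta} \sum_{j=1}^{n} \Big( y_j - \beta_0 - \sum_{k=1}^{12} \beta_k \hat{y}_{k,j} \Big)^2$$

When $\tilde{Z}^{\top}\tilde{Z}$ is invertible, the minimizer has the closed form $\hat{\beta} = (\tilde{Z}^{\top}\tilde{Z})^{-1}\tilde{Z}^{\top}Y$.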
The detailed steps for stacking prediction are as follows:
1. Data preprocessing: Split the original dataset $(X, Y)$ into the training set $(X_{\mathrm{train}}, Y_{\mathrm{train}})$ and the test set $(X_{\mathrm{test}}, Y_{\mathrm{test}})$;
2. Base learner training: Train each base learner on the training set to generate predictions $\hat{Y}_{m,\mathrm{train}}$. Simultaneously, perform predictions on the test set to generate $\hat{Y}_{m,\mathrm{test}}$;
3. Secondary training set generation: Combine the predictions of the base learners on the training set to form a new feature matrix $Z_{\mathrm{train}}$. Similarly, combine the predictions on the test set to form $Z_{\mathrm{test}}$;
4. Meta-learner training: Train the meta-learner $g(Z_{\mathrm{train}})$ using the secondary training set $Z_{\mathrm{train}}$;
5. Final prediction: Use the trained meta-learner to make predictions on the secondary test set feature matrix $Z_{\mathrm{test}}$, generating the final prediction $\hat{Y}_{\mathrm{final}}$.
By combining the predictions of multiple base learners and using a meta-learner to aggregate these results, the stacking ensemble model enhances the overall prediction accuracy of the model [37].
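The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the split rule (first 72 samples for training) is assumed, and the three backbones are hypothetical stand-ins (ordinary least squares and two ridge regressions) because Informer and SVR are not reproduced here. The helper names `fit_linear` and `predict_linear` are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's dataset).
n_samples, n_train = 96, 72
X_basic = rng.normal(size=(n_samples, 3))
X_region = rng.normal(size=(n_samples, 5))
X_usage = rng.normal(size=(n_samples, 5))
y = (X_basic @ np.array([2.0, -1.0, 0.5])
     + X_region.sum(axis=1)
     + 0.5 * X_usage.sum(axis=1)
     + rng.normal(scale=0.1, size=n_samples))

# The four feature combinations X_i, keyed by model-name suffix.
feature_sets = {
    "": X_basic,
    "-RB": np.hstack([X_basic, X_region]),
    "-UB": np.hstack([X_basic, X_usage]),
    "-RB-UB": np.hstack([X_basic, X_region, X_usage]),
}

# Hypothetical stand-in backbones: name -> L2 penalty of a linear model.
backbones = {"OLS": 0.0, "RIDGE1": 1.0, "RIDGE10": 10.0}


def fit_linear(X, y, l2=0.0):
    """Least-squares fit with intercept; l2 > 0 adds a ridge penalty."""
    A = np.hstack([np.ones((len(X), 1)), X])
    if l2 > 0:
        # Ridge via row augmentation; the intercept is left unpenalised.
        R = np.sqrt(l2) * np.eye(A.shape[1])[1:]
        A = np.vstack([A, R])
        y = np.concatenate([y, np.zeros(len(R))])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict_linear(beta, X):
    """Apply intercept beta[0] and coefficients beta[1:] to X."""
    return beta[0] + X @ beta[1:]


# Step 1: split into training and test sets (ordered split, an assumption).
train, test = slice(0, n_train), slice(n_train, n_samples)

# Steps 2-3: train the 12 base learners and stack their predictions
# column-wise into Z_train and Z_test.
names, z_train_cols, z_test_cols = [], [], []
for backbone, l2 in backbones.items():
    for suffix, X in feature_sets.items():
        beta = fit_linear(X[train], y[train], l2)
        names.append(backbone + suffix)
        z_train_cols.append(predict_linear(beta, X[train]))
        z_test_cols.append(predict_linear(beta, X[test]))
Z_train = np.column_stack(z_train_cols)
Z_test = np.column_stack(z_test_cols)

# Step 4: fit the linear meta-learner g on (Z_train, Y_train).
beta_meta = fit_linear(Z_train, y[train])

# Step 5: final prediction on the secondary test set.
y_final = predict_linear(beta_meta, Z_test)

# R^2 of the stacked prediction on the held-out samples.
ss_res = np.sum((y[test] - y_final) ** 2)
ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
r2_stack = 1.0 - ss_res / ss_tot
```

The sketch follows the steps as written, so $Z_{\mathrm{train}}$ is built from in-sample predictions. A common variant builds it from out-of-fold predictions instead, which reduces the chance that the meta-learner favors base learners that overfit the training set.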

4. Experimental Results

This section presents and discusses the experimental results. Specifically, the base models with different feature combinations are compared to the three backbone models. Informer, MLR, and SVR were chosen as the comparison methods in this study because of their strong performance and broad application in long time-series forecasting, linear regression, and nonlinear regression tasks, respectively. Additionally, the prediction results of the stacking ensemble model are compared with those of the 12 base models to evaluate the effects of different feature combinations and the ensemble method.

4.1. Evaluation Metrics

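The models were evaluated using the dataset described in Section 3.1, with the main conclusions drawn based on these metrics. To comprehensively assess the performance of the ensemble scheme and the basic methods, six commonly used regression evaluation metrics were employed: mean relative error (MRE), mean absolute error (MAE), symmetric mean absolute percentage error (SMAPE), coefficient of determination ($R^2$), root mean square error (RMSE), and mean absolute percentage error (MAPE), defined as follows [38]:
$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{SMAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| y_i - \hat{y}_i \right|}{\left( \left| y_i \right| + \left| \hat{y}_i \right| \right) / 2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%$$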
where $y_i$ represents the actual value, $\hat{y}_i$ denotes the predicted value, $\bar{y}$ represents the average of the actual values, and $n$ denotes the sample size. MRE reflects the degree of deviation of the predicted value relative to the actual value and is a dimensionless indicator. MAE ranges from 0 to $+\infty$ and measures the absolute difference between the predicted and actual values, with smaller values indicating higher prediction accuracy. SMAPE normalizes each error by the mean of the absolute actual and predicted values, so that neither quantity alone determines the scale of the error. $R^2$ is at most 1, with values closer to 1 indicating stronger explanatory power; it can be negative on held-out data when a model fits worse than predicting the mean. RMSE is the square root of the mean squared prediction error, with smaller values indicating higher prediction accuracy. Compared with MAE, RMSE is more sensitive to large errors. MAPE expresses the error as a percentage and is useful for comparing the performance of different models on the same dataset [39].
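The six metrics can be computed directly from their definitions. The function below is a minimal pure-Python sketch; the name `regression_metrics` and the toy input values are illustrative assumptions and are not taken from the study's data. It assumes all actual values are non-zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MRE, MAE, SMAPE, R2, RMSE, and MAPE (%) for two equal-length sequences."""
    n = len(y_true)
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]

    mre = sum(e / abs(t) for e, t in zip(abs_err, y_true)) / n
    mae = sum(abs_err) / n
    smape = sum(
        e / ((abs(t) + abs(p)) / 2)
        for e, t, p in zip(abs_err, y_true, y_pred)
    ) / n

    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)

    return {
        "MRE": mre,
        "MAE": mae,
        "SMAPE": smape,
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100 * mre,  # MAPE is MRE expressed as a percentage
    }


# Toy values for illustration only.
metrics = regression_metrics([100, 200, 300, 400], [110, 190, 330, 380])
```

Because MAPE is MRE multiplied by 100, the two columns in the result tables carry the same information on different scales.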

4.2. Experimental Results Analysis

During the model construction process, comparative experiments were conducted to evaluate the impact of incorporating weather features on the predictive performance of the models. The experiments were carried out using the Informer (INF), SVR, and MLR models, comparing the performance of models with and without weather features added to the baseline features.
As shown in Table 5, the inclusion of weather features led to a significant improvement in the $R^2$ values of the models, increasing them by 5 to 10 percentage points. This improvement can be attributed to the strong influence of weather on natural gas demand. The correlations between minimum and maximum temperatures and gas purchase volume were notably negative ($r = -0.67$ and $r = -0.69$, respectively), indicating that lower temperatures drive higher heating demand. These patterns cannot be fully explained by time-based features (e.g., year and month) alone. Therefore, weather features were included by default in all subsequent experiments.
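A correlation of this kind is the Pearson coefficient between a temperature series and the purchase-volume series. The sketch below shows the calculation in pure Python. The monthly temperatures and volumes are invented toy numbers chosen only to show a negative relationship, so the resulting value is not comparable to the coefficients reported for the real data; the function name `pearson_r` is likewise an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented toy data: temperature (deg C) and purchase volume (million m^3).
temperature = [2, 5, 10, 18, 25, 28]
volume = [17.1, 15.2, 13.3, 9.9, 8.7, 8.0]

r = pearson_r(temperature, volume)  # negative: colder months, higher volume
```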
For each backbone (Informer, MLR, SVR), four experiments were conducted. Taking Informer as an example, these were: one without regional or usage features (INF), one with regional features (INF-RB), one with usage features (INF-UB), and one with both regional and usage features (INF-RB-UB). The same scheme was applied to MLR and SVR. The aim was to evaluate the impact of different feature combinations on predictive performance and to identify the optimal feature combination.
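The resulting experimental grid of three backbones and four feature combinations can be enumerated programmatically. The snippet below is a small sketch: the group labels `"basic"`, `"region"`, and `"usage"` are placeholder names for the corresponding feature groups (with `"basic"` standing for the time and weather features), and the dictionary layout is an assumption made for illustration.

```python
from itertools import product

BACKBONES = ["INF", "MLR", "SVR"]

# Suffix of the model name -> feature groups fed to the model.
FEATURE_SETS = {
    "": ["basic"],
    "RB": ["basic", "region"],
    "UB": ["basic", "usage"],
    "RB-UB": ["basic", "region", "usage"],
}


def model_name(backbone, suffix):
    """Build names such as 'INF' or 'SVR-RB-UB'."""
    return f"{backbone}-{suffix}" if suffix else backbone


# One entry per base model: name -> list of feature groups.
base_models = {
    model_name(backbone, suffix): groups
    for backbone, (suffix, groups) in product(BACKBONES, FEATURE_SETS.items())
}
```

Iterating over `base_models` then visits each of the 12 configurations once, in the order INF, INF-RB, INF-UB, INF-RB-UB, followed by the MLR and SVR variants.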
Next, the prediction results of the Informer, MLR, and SVR models will be presented and analyzed in detail.

4.2.1. Informer Experimental Results

In the performance evaluation of the Informer (see Table 6 and Figure 5), the INF-RB-UB combination yields the lowest errors, with MRE, MAE, and SMAPE values of 0.0789, 1,000,462, and 0.0811, respectively, and an $R^2$ of 76.77%. This clearly outperforms the INF combination, which lacks regional and usage features and achieves an $R^2$ of only 70.46%. The INF-RB combination attains a marginally higher $R^2$ (77.19%) and the lowest RMSE, but its relative errors are larger. These results highlight the substantial impact of incorporating regional and usage features on the model's explanatory power and prediction accuracy.
Figure 5 illustrates the comparison between the predicted values of the four base models of INF and the actual natural gas purchase volumes. The data clearly exhibit significant seasonal patterns, with natural gas demand peaking in the winter and declining in the summer, reflecting increased heating needs and low-demand periods, respectively. Overall, the models perform better during low-demand periods compared to high-demand periods. Models incorporating regional and usage features demonstrate better alignment with actual values, particularly in capturing demand peaks. Similar patterns are also observed in MLR and SVR models, which exhibit consistent behavior across seasonal variations. For instance, the feature combination in the INF-RB model enables it to better track the substantial increase in demand caused by low temperatures.

4.2.2. MLR Experimental Results

The performance of the MLR model with various feature combinations further validates this conclusion (see Table 7 and Figure 6). The MLR-RB-UB combination is the top performer among all feature combinations, achieving an $R^2$ of 82.76%, well above the 67.06% of the MLR combination without additional features. It also excels in the other evaluation metrics: the MRE is 0.0823, compared to 0.1150 for the MLR combination; the MAE is 943,053 versus 1,297,328; and the SMAPE is 0.0835 compared to 0.1111, all indicating a significant reduction in error. The MLR-RB and MLR-UB combinations also improve on every metric. Relative to the baseline $R^2$ of 67.06%, their $R^2$ values increased by 15.67 and 10.30 percentage points, respectively, and relative to the baseline MAPE of 11.50%, their MAPE decreased by 3.27 and 1.91 percentage points, with MLR-RB outperforming MLR-UB. These figures demonstrate the substantial enhancement in predictive performance when appropriate feature combinations are introduced.

4.2.3. SVR Experimental Results

The results of the SVR model across different feature combinations are summarized in Table 8 and visualized in Figure 7. The SVR-RB model, which integrates regional features, improves on every metric: the MRE is reduced to 0.0702, the MAE decreases to 817,575, the SMAPE is lowered to 0.0718, the $R^2$ increases to 88.60%, and the RMSE drops to 990,176, outperforming all other feature combinations. Adding usage features alone (SVR-UB) yields a modest gain in $R^2$ (80.30% versus 79.18%) and RMSE, although MRE, MAE, and SMAPE do not improve. The SVR-RB-UB model, which combines regional and usage features, performs better than the plain SVR model but still lags behind SVR-RB. This difference can be attributed to the added complexity and potential noise that come with an expanded feature set. Feature integration can be beneficial, but it is the judicious selection and combination of features that drives predictive power.
Overall, the predictive performance of all three backbones (Informer, MLR, and SVR) generally improves with the inclusion of regional and usage features. In most configurations, these features not only raise the $R^2$ values, indicating stronger explanatory power, but also reduce metrics such as MAE, MRE, SMAPE, RMSE, and MAPE, thereby decreasing prediction errors.
These results clearly demonstrate the importance of feature engineering in enhancing the performance of natural gas purchase volume prediction models.

4.2.4. Stacking Experimental Results

The stacking ensemble model was constructed by integrating the results of four experiments, each under different feature combinations for the three backbone models (Informer, MLR, SVR), resulting in a total of 12 base models. Stacking effectively captures feature relationships and patterns in the data by using the prediction results of the base models (i.e., the outputs of the first layer) as new input features for the second layer meta-learner. This approach enables the effective combination of the strengths of multiple models, thereby enhancing the overall prediction accuracy. The performance evaluation and prediction results of the stacking ensemble model are shown in Table 9 and Figure 8.
In Table 9, the performance evaluation results of the stacking ensemble model are compared with the optimal feature combinations (by $R^2$) of INF, MLR, and SVR. The stacking ensemble model outperforms the base models across all metrics.
The stacking ensemble model exhibits lower mean relative error (MRE), mean absolute error (MAE), symmetric mean absolute percentage error (SMAPE), and mean absolute percentage error (MAPE) than the base models. This indicates higher accuracy and stability in predicting natural gas purchase volumes, particularly in terms of relative errors. Furthermore, the stacking model achieved an $R^2$ of 92.57%, which is about 4–15 percentage points higher than the $R^2$ values of the best-performing base models in Table 9, demonstrating its superior ability to explain the variability in the data. Its root mean square error (RMSE) of 799,635 is also the lowest among all models, indicating the most stable and accurate results.
Figure 8 illustrates the comparison between the predicted values and the actual values using the stacking ensemble model. As seen in the figure, the prediction results of the stacking ensemble model are very close to the actual values, especially in terms of trend changes and fluctuations.
These results validate the superiority of the stacking model in the task of predicting natural gas purchase volumes. Compared to individual base models, the stacking model demonstrates higher accuracy across all evaluation metrics, particularly in key indicators such as $R^2$, MRE, MAE, RMSE, and MAPE.
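The margin between the stacking model and the best base model of each backbone can be recomputed from the $R^2$ and RMSE columns of Table 9. The snippet below is a small arithmetic check; the variable names are chosen for illustration only.

```python
# R^2 (%) and RMSE values as listed in Table 9.
r2 = {"INF-RB": 77.19, "MLR-RB-UB": 82.76, "SVR-RB": 88.60, "Stacking": 92.57}
rmse = {"INF-RB": 1_400_524, "MLR-RB-UB": 1_217_738, "SVR-RB": 990_176, "Stacking": 799_635}

# Gain of the stacking model over each base model, in percentage points of R^2.
r2_gain = {
    name: round(r2["Stacking"] - value, 2)
    for name, value in r2.items()
    if name != "Stacking"
}

# Relative RMSE reduction (%) of the stacking model versus each base model.
rmse_reduction = {
    name: 100 * (value - rmse["Stacking"]) / value
    for name, value in rmse.items()
    if name != "Stacking"
}
```

The $R^2$ gains come out at 15.38, 9.81, and 3.97 percentage points for INF-RB, MLR-RB-UB, and SVR-RB, respectively, which is the range of roughly 4 to 15 points quoted in the text.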

4.3. Engineering Application

To further validate the performance of the stacking ensemble model in terms of real-world prediction and generalization capabilities, the model was applied to the Pi County area of Chengdu, China. In October 2023, it predicted the natural gas procurement volumes for this area from November 2023 to October 2024. When writing this paper in November 2024, we compared the predicted values with the actual procurement volumes during this period, with the specific details presented in Table 10.
Table 10 shows that the stacking ensemble model predicts natural gas procurement volumes accurately. The best month was March 2024, with an error of only 0.35%. The largest errors occurred in November 2023 and in April, May, and October 2024, at 5.39%, 4.84%, 5.37%, and 5.03%, respectively. These deviations may have been due to the economic slowdown that began in 2024, which reduced commercial production and consumer spending and thereby affected natural gas demand [40]. Despite these challenges, the absolute error stayed below 6% in every month, which attests to the model's reliability in predicting natural gas purchase volumes. Even in a volatile economic environment, the model provides useful reference data for decision-makers.
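The percentage errors in Table 10 can be reproduced from the predicted and actual volumes as (predicted − actual) / actual × 100. The following sketch recomputes them from the tabulated values; the helper name `percentage_error` and the variable names are assumptions made for illustration.

```python
# (date, stacking prediction, actual value) as listed in Table 10.
table_10 = [
    ("2023.11", 11_259_021, 10_683_648),
    ("2023.12", 15_951_523, 15_226_730),
    ("2024.01", 16_701_125, 17_109_964),
    ("2024.02", 14_225_391, 13_843_689),
    ("2024.03", 13_317_303, 13_270_379),
    ("2024.04", 10_375_301, 9_896_262),
    ("2024.05", 10_199_972, 9_680_592),
    ("2024.06", 9_148_206, 8_778_636),
    ("2024.07", 8_987_714, 8_687_648),
    ("2024.08", 8_189_393, 7_973_548),
    ("2024.09", 8_836_944, 9_127_189),
    ("2024.10", 10_923_210, 10_400_086),
]


def percentage_error(predicted, actual):
    """Signed error in percent; positive means over-prediction."""
    return 100.0 * (predicted - actual) / actual


errors = {date: percentage_error(pred, act) for date, pred, act in table_10}

# Months with the largest and smallest absolute error.
worst_month = max(errors, key=lambda d: abs(errors[d]))
best_month = min(errors, key=lambda d: abs(errors[d]))
```

With this sign convention, a positive value means the model over-predicted the purchase volume, which is the case in ten of the twelve months.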

5. Conclusions

The accurate prediction of natural gas purchase volumes is critical for optimizing resource allocation, improving decision-making processes, and ensuring efficient supply chain management. This study introduces a novel approach by integrating fine-grained features, such as regional and usage attributes, into a stacking ensemble framework to address the challenges of modeling complex nonlinear relationships. The proposed method combines three backbone models—Informer, MLR, and SVR—within a stacking ensemble framework, consisting of 12 base learners, to effectively leverage weather, regional, and usage features.
Experimental results demonstrate that incorporating these features significantly improves prediction accuracy. The stacking ensemble model consistently outperformed the individual base models across all evaluation metrics, including $R^2$, MAE, MRE, SMAPE, and MAPE. For example, the $R^2$ values of Informer, MLR, and SVR improved notably after integrating weather, regional, and usage features, and the stacking model's $R^2$ was a further 4–15 percentage points higher than those of the best-performing base models (Table 9). Additionally, the stacking ensemble model maintained a reasonable prediction error, with a maximum error of 5.39% over the period from November 2023 to October 2024, even under a potential economic downturn. These results highlight its robust generalization capabilities and practical applicability, demonstrating the effectiveness of integrating fine-grained features and ensemble learning techniques for accurate natural gas purchase volume prediction.
Although the stacking ensemble method demonstrated excellent predictive performance, several limitations must be acknowledged. First, due to the inclusion of 12 base models, the model exhibits relatively high computational complexity, which may restrict its applicability to larger datasets or real-time forecasting scenarios. Future research could leverage distributed training or parallel computing techniques to improve the model’s computational efficiency and scalability. Second, since the data used in this study were collected exclusively from Pi County, Chengdu, the model’s applicability to other geographical regions or economic conditions has yet to be validated. Expanding the dataset and testing the model in diverse contexts will be essential to confirm its generalizability.
In addition, while this study successfully mitigated the risk of overfitting through techniques such as cross-validation and L2 regularization, further optimization of the model architecture is necessary to strike a balance between complexity and performance, preventing potential overfitting on other datasets. Future research will also explore more advanced ensemble learning techniques (e.g., XGBoost and LightGBM) and conduct comparative analyses with other nonlinear algorithms to comprehensively evaluate their performance in modeling complex nonlinear relationships. Lastly, plans are in place to incorporate additional external influencing factors, such as economic indicators and population changes, to further enhance predictive accuracy and provide robust, efficient decision-making support for natural gas procurement.

Author Contributions

Investigation, Y.L., Q.Y. and Y.B.; methodology and software, J.W. and L.J.; writing—original draft, J.W., L.Z., Y.L. and Q.Y.; supervision and writing—review and editing, L.J.; Experiments and data analysis, J.W., L.Z., Y.L., Q.Y. and Y.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy concerns.

Conflicts of Interest

Authors Le Zhang, Yaqi Liu, Qihong Yu and Yuheng Bu were employed by the company Chengdu Pidu District Xingneng Natural Gas Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Di Bella, G.; Flanagan, M.; Foda, K.; Maslova, S.; Pienkowski, A.; Stuermer, M.; Toscani, F. Natural gas in europe: The potential impact of disruptions to supply. Energy Econ. 2024, 138, 107777.
  2. Shahbaz, M.; Lean, H.H.; Farooq, A. Natural gas consumption and economic growth in pakistan. Renew. Sustain. Energy Rev. 2013, 18, 87–94.
  3. Lim, B.; Zohren, S. Time-series forecasting with deep learning: A survey. Philos. Trans. R. Soc. 2021, 379, 20200209.
  4. Schaffer, A.L.; Dobbins, T.A.; Pearson, S.-A. Interrupted time series analysis using autoregressive integrated moving average (arima) models: A guide for evaluating large-scale health interventions. BMC Med. Res. Methodol. 2021, 21, 58.
  5. Khotanzad, A.; Elragal, H. Natural gas load forecasting with combination of adaptive neural networks. In Proceedings of the IJCNN’99, International Joint Conference on Neural Networks, Proceedings (Cat. No. 99CH36339), Washington, DC, USA, 10–16 July 1999; Volume 6, pp. 4069–4072.
  6. Szoplik, J. Forecasting of natural gas consumption with artificial neural networks. Energy 2015, 85, 208–220.
  7. Galván, E.; Mooney, P. Neuroevolution in deep neural networks: Current trends and future challenges. IEEE Trans. Artif. 2021, 2, 476–493.
  8. Azadeh, A.; Asadzadeh, S.; Mirseraji, G.; Saberi, M. An emotional learning-neuro-fuzzy inference approach for optimum training and forecasting of gas consumption estimation models with cognitive data. Technol. Forecast. Soc. Change 2015, 91, 47–63.
  9. Ding, J.; Zhao, Y.; Jin, J. Forecasting natural gas consumption with multiple seasonal patterns. Appl. Energy 2023, 337, 120911.
  10. Khan, M.A.; Ahmad, U. Energy demand in pakistan: A disaggregate analysis. Pak. Dev. Rev. 2008, 47, 437–455.
  11. Sánchez-Úbeda, E.F.; Berzosa, A. Modeling and forecasting industrial end-use natural gas consumption. Energy Econ. 2007, 29, 710–742.
  12. Nasr, G.; Badr, E.; Joun, C. Backpropagation neural networks for modeling gasoline consumption. Energy Convers. Manag. 2003, 44, 893–905.
  13. Khan, M.A. Modelling and forecasting the demand for natural gas in pakistan. Renew. Sustain. Energy Rev. 2015, 49, 1145–1159.
  14. Zhao, H.; Zhou, Z.; Zhang, P. Forecasting of the short-term electricity load based on woa-bilstm. Int. J. Pattern Recognition Artif. Intell. 2023, 37, 2359018.
  15. Zhang, Y.; Liu, J.; Shen, W. A review of ensemble learning algorithms used in remote sensing applications. Appl. Sci. 2022, 12, 8654.
  16. Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Rev. 2021, 54, 1937–1967.
  17. Svoboda, R.; Kotik, V.; Platos, J. Short-term natural gas consumption forecasting from long-term data collection. Energy 2021, 218, 119430.
  18. Altman, N.; Krzywinski, M. Ensemble methods: Bagging and random forests. Nat. Methods 2017, 14, 933–935.
  19. Ngo, G.; Beard, R.; Chandra, R. Evolutionary bagging for ensemble learning. Neurocomputing 2022, 510, 1–14.
  20. Meira, E.; Oliveira, F.L.C.; de Menezes, L.M. Forecasting natural gas consumption using bagging and modified regularization techniques. Energy Econ. 2022, 106, 105760.
  21. Divina, F.; Gilson, A.; Goméz-Vela, F.; Torres, M.G.; Torres, J.F. Stacking ensemble learning for short-term electricity consumption forecasting. Energies 2018, 11, 949.
  22. Zhou, Z.-H. Ensemble methods. Combining Pattern Classifiers; Wiley: Hoboken, NJ, USA, 2014; pp. 186–229.
  23. Zhou, Z.-H. Ensemble Methods: Foundations and Algorithms; CRC Press: Boca Raton, FL, USA, 2012.
  24. Papageorgiou, K.I.; Poczeta, K.; Papageorgiou, E.; Gerogiannis, V.C.; Stamoulis, G. Exploring an ensemble of methods that combines fuzzy cognitive maps and neural networks in solving the time series prediction problem of gas consumption in greece. Algorithms 2019, 12, 235.
  25. Yang, Z.; Keung, J.; Kabir, M.A.; Yu, X.; Tang, Y.; Zhang, M.; Feng, S. Acomnn: Attention enhanced compound neural network for financial time-series forecasting with cross-regional features. Appl. Soft Comput. 2021, 111, 107649.
  26. Cheng, L.; Zang, H.; Xu, Y.; Wei, Z.; Sun, G. Probabilistic residential load forecasting based on micrometeorological data and customer consumption pattern. IEEE Trans. Power Syst. 2021, 36, 3762–3775.
  27. Wang, J.; Zhong, H.; Lai, X.; Xia, Q.; Wang, Y.; Kang, C. Exploring key weather factors from analytical modeling toward improved solar power forecasting. IEEE Trans. Smart Grid 2017, 10, 1417–1427.
  28. Wadud, Z.; Dey, H.S.; Kabir, M.A.; Khan, S.I. Modeling and forecasting natural gas demand in bangladesh. Energy Policy 2011, 39, 7372–7380.
  29. Tong, M.; Qin, F.; Dong, J. Natural gas consumption forecasting using an optimized grey bernoulli model: The case of the world’s top three natural gas consumers. Eng. Appl. Artif. Intell. 2023, 122, 106005.
  30. Omuya, E.O.; Okeyo, G.O.; Kimwele, M.W. Feature selection for classification using principal component analysis and information gain. Expert Syst. Appl. 2021, 174, 114765.
  31. Zhou, W.; Liu, C.; Yuan, P.; Jiang, L. An undersampling method approaching the ideal classification boundary for imbalance problems. Appl. Sci. 2024, 14, 5421.
  32. Shao, P.; Zheng, B.; Tang, X.; Chen, C.; Hou, X. Diagnostic method for demagnetization fault of elevator synchronous traction machine based on informer. Int. J. Pattern Recognit. Artif. 2024.
  33. Shams, S.R.; Jahani, A.; Kalantary, S.; Moeinaddini, M.; Khorasani, N. The evaluation on artificial neural networks (ann) and multiple linear regressions (mlr) models for predicting so2 concentration. Urban Clim. 2021, 37, 100837.
  34. Sun, Y.; Ding, S.; Zhang, Z.; Jia, W. An improved grid search algorithm to optimize svr for prediction. Soft Comput. 2021, 25, 5633–5644.
  35. Ma, C.; Zhai, X.; Wang, Z.; Tian, M.; Yu, Q.; Liu, L.; Liu, H.; Wang, H.; Yang, X. State of health prediction for lithium-ion batteries using multiple-view feature fusion and support vector regression ensemble. Int. Mach. Learn. Cybern. 2019, 10, 2269–2282.
  36. Lee, T.; Kim, J.-H.; Lee, S.-J.; Ryu, S.-K.; Joo, B.-C. Improvement of concrete crack segmentation performance using stacking ensemble learning. Appl. Sci. 2023, 13, 2367.
  37. Abdellatif, A.; Mubarak, H.; Ahmad, S.; Ahmed, T.; Shafiullah, G.; Hammoudeh, A.; Abdellatef, H.; Rahman, M.; Gheni, H.M. Forecasting photovoltaic power generation with a stacking ensemble model. Sustainability 2022, 14, 11083.
  38. Erickson, B.J.; Kitamura, F. Magician’s corner: 9. performance metrics for machine learning models. Radiol. Artif. Intell. 2021, 3, e200126.
  39. Naser, M.; Alavi, A.H. Error metrics and performance fitness indicators for artificial intelligence and machine learning in engineering and sciences. Archit. Struct. Constr. 2023, 3, 499–517.
  40. Xu, G.; Chen, Y.; Yang, M.; Li, S.; Marma, K.J.S. An outlook analysis on china’s natural gas consumption forecast by 2035: Applying a seasonal forecasting method. Energy 2023, 284, 128602.
Figure 1. Data visualization.
Figure 2. Correlation matrix of regional features, basic features, and gas purchase volume.
Figure 3. Correlation matrix of usage types, basic features, and gas purchase volume.
Figure 4. Stacking ensemble model architecture.
Figure 5. Comparison of actual and predicted gas purchase volumes using the INF model with different feature combinations. (a) INF; (b) INF-RB; (c) INF-UB; (d) INF-RB-UB.
Figure 6. Comparison of actual and predicted gas purchase volumes using the MLR model with different feature combinations. (a) MLR; (b) MLR-RB; (c) MLR-UB; (d) MLR-RB-UB.
Figure 7. Comparison of actual and predicted gas purchase volumes using the SVR model with different feature combinations. (a) SVR; (b) SVR-RB; (c) SVR-UB; (d) SVR-RB-UB.
Figure 8. Comparison of actual and predicted gas purchase volumes using stacking.
Table 1. Data and sources.

| Data | Data Content | Data Source |
|---|---|---|
| 2016–2023 Monthly Purchase Volume Statistics | Natural gas purchase volume (m³) | Natural Gas LLC, Chengdu District |
| 2016–2023 Monthly Sales Volume Statistics (detailed by region) | Anjing, Hongguang, Pixian, Qiaosong, Xipu (m³) | Natural Gas LLC, Chengdu District |
| 2016–2023 Monthly Sales Volume Statistics (detailed by usage) | School, Commercial, Industrial, Residential, Collective (m³) | Natural Gas LLC, Chengdu District |
| 2016–2023 Monthly Weather Data | Max temperature, Min temperature (°C) | Weather Station |
Table 2. Informer model training parameters.

| Parameter | Description | Value |
|---|---|---|
| seq_len | Length of the input sequence | 12 |
| label_len | Length of the label sequence for prediction | 6 |
| pred_len | Length of the output prediction sequence | 3 |
| d_model | Dimension of the input embedding | 512 |
| n_heads | Number of attention heads | 8 |
| e_layers | Number of encoder layers | 2 |
| d_layers | Number of decoder layers | 1 |
| d_ff | Dimension of the feed-forward network | 2048 |
| dropout | Dropout rate | 0.05 |
| activation | Activation function | gelu |
Table 3. MLR model training parameters.

| Parameter | Description | Value |
|---|---|---|
| fit_intercept | Whether to calculate the intercept | True |
| normalize | Whether to normalize the input data | False |
| copy_X | Whether to copy the input data or overwrite it | True |
Table 4. SVR model training parameters.

| Parameter | Description | Value |
|---|---|---|
| kernel | Type of kernel function | ‘rbf’ (Radial Basis Function) |
| C | Regularization parameter | $10^0, 10^1, 10^2, 10^3$ |
| gamma | RBF kernel parameter | np.logspace(−2, 2, 5) |
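Table 4 lists several candidate values for `C` and `gamma` rather than a single setting. Assuming these values form a search grid (the search procedure itself is not described in this section), the candidate combinations can be enumerated as follows. The variable names are illustrative.

```python
from itertools import product

import numpy as np

# Candidate values as listed in Table 4.
param_grid = {
    "kernel": ["rbf"],
    "C": [10.0 ** k for k in range(4)],        # 1, 10, 100, 1000
    "gamma": np.logspace(-2, 2, 5).tolist(),   # 0.01, 0.1, 1, 10, 100
}

# Every combination of the listed values, one dict per candidate setting.
candidates = [
    dict(zip(param_grid, combo))
    for combo in product(*param_grid.values())
]
n_candidates = len(candidates)
```

A dictionary of this shape (parameter name mapped to a list of values) is also the form accepted by the `param_grid` argument of scikit-learn's `GridSearchCV`, should an exhaustive search over the 20 combinations be wanted.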
Table 5. Performance comparison of models with and without weather features ($R^2$, %).

| Approach | With Weather Features ($R^2$, %) | Without Weather Features ($R^2$, %) |
|---|---|---|
| INF | 70.46 | 65.15 |
| SVR | 67.06 | 57.29 |
| MLR | 79.18 | 71.83 |
Table 6. Performance evaluation results of INF.

| Approach | MRE | MAE | SMAPE | $R^2$ (%) | RMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| INF | 0.0851 | 1,094,138 | 0.0881 | 70.46 | 1,593,979 | 8.51 |
| INF-RB | 0.0957 | 1,117,435 | 0.1013 | 77.19 | 1,400,524 | 9.57 |
| INF-UB | 0.0846 | 1,056,106 | 0.0861 | 76.34 | 1,426,436 | 8.46 |
| INF-RB-UB | 0.0789 | 1,000,462 | 0.0811 | 76.77 | 1,413,591 | 7.89 |
Table 7. Performance evaluation results of MLR.

| Approach | MRE | MAE | SMAPE | $R^2$ (%) | RMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| MLR | 0.1150 | 1,297,328 | 0.1111 | 67.06 | 1,683,193 | 11.50 |
| MLR-RB | 0.0823 | 943,818 | 0.0835 | 82.73 | 1,218,667 | 8.23 |
| MLR-UB | 0.0959 | 1,106,853 | 0.0953 | 77.36 | 1,395,385 | 9.59 |
| MLR-RB-UB | 0.0823 | 943,053 | 0.0835 | 82.76 | 1,217,738 | 8.23 |
Table 8. Performance evaluation results of SVR.

| Approach | MRE | MAE | SMAPE | $R^2$ (%) | RMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| SVR | 0.0976 | 1,074,106 | 0.0957 | 79.18 | 1,338,196 | 9.76 |
| SVR-RB | 0.0702 | 817,575 | 0.0718 | 88.60 | 990,176 | 7.02 |
| SVR-UB | 0.0978 | 1,121,888 | 0.1018 | 80.30 | 1,301,718 | 9.78 |
| SVR-RB-UB | 0.0787 | 895,598 | 0.0799 | 83.31 | 1,198,170 | 7.87 |
Table 9. Performance comparison of optimal feature combinations in base methods and the stacking model.

| Approach | MRE | MAE | SMAPE | $R^2$ (%) | RMSE | MAPE (%) |
|---|---|---|---|---|---|---|
| INF-RB | 0.0957 | 1,117,435 | 0.1013 | 77.19 | 1,400,524 | 9.57 |
| MLR-RB-UB | 0.0823 | 943,053 | 0.0835 | 82.76 | 1,217,738 | 8.23 |
| SVR-RB | 0.0702 | 817,575 | 0.0718 | 88.60 | 990,176 | 7.02 |
| Stacking | 0.0614 | 648,303 | 0.0601 | 92.57 | 799,635 | 6.14 |
Table 10. Comparison of future predictions and actual values.

| Date | Stacking Predicted Value | Actual Value | Percentage Error (%) |
|---|---|---|---|
| 2023.11 | 11,259,021 | 10,683,648 | 5.39 |
| 2023.12 | 15,951,523 | 15,226,730 | 4.76 |
| 2024.01 | 16,701,125 | 17,109,964 | −2.39 |
| 2024.02 | 14,225,391 | 13,843,689 | 2.76 |
| 2024.03 | 13,317,303 | 13,270,379 | 0.35 |
| 2024.04 | 10,375,301 | 9,896,262 | 4.84 |
| 2024.05 | 10,199,972 | 9,680,592 | 5.37 |
| 2024.06 | 9,148,206 | 8,778,636 | 4.21 |
| 2024.07 | 8,987,714 | 8,687,648 | 3.45 |
| 2024.08 | 8,189,393 | 7,973,548 | 2.71 |
| 2024.09 | 8,836,944 | 9,127,189 | −3.18 |
| 2024.10 | 10,923,210 | 10,400,086 | 5.03 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, J.; Jiang, L.; Zhang, L.; Liu, Y.; Yu, Q.; Bu, Y. An Integrated Stacking Ensemble Model for Natural Gas Purchase Prediction Incorporating Multiple Features. Appl. Sci. 2025, 15, 778. https://doi.org/10.3390/app15020778
