Article

Predictive Modeling of the Hydrate Formation Temperature in Highly Pressurized Natural Gas Pipelines

Department of Chemical Engineering, Gebze Technical University, 41420 Gebze, Turkey
*
Author to whom correspondence should be addressed.
Energies 2024, 17(21), 5306; https://doi.org/10.3390/en17215306
Submission received: 14 June 2024 / Revised: 24 July 2024 / Accepted: 31 July 2024 / Published: 25 October 2024
(This article belongs to the Section H: Geo-Energy)

Abstract

In this study, we aim to develop advanced machine learning regression models for the prediction of hydrate temperature based on the chemical composition of sweet gas mixtures. Data were collected in accordance with the BOTAS Gas Network Code specifications, approved by the Turkish Energy Market Regulatory Authority (EMRA), and generated using DNV GasVLe v3.10 software, which predicts the phase behavior and properties of hydrocarbon-based mixtures under various pressure and temperature conditions. We employed linear regression, decision tree regression, random forest regression, generalized additive models, and artificial neural networks to create prediction models for hydrate formation temperature (HFT). The performance of these models was evaluated using the hold-out cross-validation technique to ensure unbiased results. This study demonstrates the efficacy of ensemble learning methods, particularly random forest with an R2 and Adj. R2 of 0.998, for predicting hydrate formation conditions, thereby enhancing the safety and efficiency of gas transport and processing. This research illustrates the potential of machine learning techniques in advancing the predictive accuracy for hydrate formations in natural gas pipelines and suggests avenues for future optimizations through hybrid modeling approaches.

1. Introduction

Natural gas will be the most used fuel of the 21st century, at least until hydrogen for fuel cells can be produced inexpensively, for three reasons. First, compared to oil or coal, gas burns more cleanly, creates fewer pollutants, and produces less carbon dioxide overall. Second, natural gas liquids are a key feedstock of the petrochemical industry. Third, the breakdown of supply chains during the COVID-19 pandemic and the U.S. sanctions imposed on Russia due to its war in Ukraine drove up natural gas prices in Europe, which reached 338 EUR/MWh on 26 August 2022, the highest price ever recorded on the Title Transfer Facility (TTF), the Amsterdam-based virtual reference point for gas trading in Europe [1]. Therefore, as proven gas reserves are depleted, natural gas will have to be sought in harsher and more remote locations.
During production activities, unusual conditions are more likely to be encountered. When 'guest' molecules such as carbon dioxide or methane interact with water at ambient temperatures (usually below 25 °C) and modest pressures (above 6 bar), clathrate hydrates typically form. In these non-stoichiometric hydrates, single small (<0.9 nm) guest molecules are molecularly encaged by hydrogen-bonded water cavities.
Hydrate formation is highly frequent because water is constantly present during gas production and can also accumulate owing to faulty pipeline construction. Hydrates are a flow assurance issue in multiphase transfer pipelines, surface facilities, and wellheads. Hydrate prevention methods must be applied successfully and promptly because pipeline plugging caused by hydrate formation can occasionally stop production for days or even months. This can have an economic impact on the scale of millions of dollars in gas gathering systems or highly pressurized natural gas transmission pipelines. Hydrate formation is most likely to occur at high pressures and low temperatures. Hydrate structures come in three types: SI, SII, and SH [2].
The different hydrate structures, along with the corresponding types of cages, are displayed in Figure 1. Structure I contains two different types of cages: a large 5¹²6² tetrakaidecahedral cage and a small 5¹² pentagonal dodecahedral cage [3]. Structure II also contains the small 5¹² cage. Structure H consists of a small 5¹² cage, a medium 4³5⁶6³ cage, and a large 5¹²6⁸ icosahedral cage. For instance, CH4 can fit into both the large and small SI cages, while C3H8 (propane) can only form hydrates within the larger SII cage structures, as the smaller SI cages cannot accommodate it due to its size. This is because the size of the guest molecule plays a significant role in determining the resulting hydrate structure [4,5].
It seems impracticable to try to use experimental methods to determine the hydrate formation pressure (HFP) and temperature (HFT) for any specific gas mixture. Therefore, finding a systematic approach for precise hydrate formation prediction is essential. The following are the well-known papers in the literature on investigation of hydrate formation prediction: Hammerschmidt [6], K-value methods and Katz gravity [7,8], Wichert and Baillie [9], Mann [10], Makogon [11], Berg [12], Kobayashi [13], Motiee [14], Østergaard [15], Mokhatab and Towler [16], Vuthaluru and Bahadori [17] Safamirzaei [18], and Salufu [19].
All of the aforementioned correlations suffer from failing to discriminate between the various hydrate structures. Even a very small quantity of an SII-forming component causes the SII structure to dominate the hydrate phase. Although this is a structural difference, it significantly impacts the HFT and HFP, and it may be the main reason why these correlations were unsuccessful in forecasting the HFT and HFP of natural gas mixtures [20].
Table 1 presents calculated hydrate formation temperatures (HFTs) for various natural gas compositions under specific pressure (P) and specific gravity (SG) conditions. The data suggest a correlation between SG and the hydrate formation temperature: gases with higher specific gravity, which typically contain heavier hydrocarbons, tend to form hydrates at higher temperatures than lighter gases. However, the relationship between SG and HFT is not strictly linear; other factors, such as variations in gas composition, can cause deviations of up to 27 °C in the HFT. The impact of SG on the hydrate formation temperature also depends on pressure: at higher pressures, the effect of SG is less pronounced, and other factors, such as water content, play a more dominant role. Consequently, with the development of artificial intelligence, new alternatives for HFT and HFP prediction have emerged in recent years to address this challenging problem in the oil and gas industry. Machine learning regression models, in particular, have attracted attention for their ability to forecast hydrate formation [21].
In Table 2, our results are compared with the results within the literature. Our research stands out from previous studies by using a significantly larger dataset and implementing cutting-edge algorithms. Furthermore, we have assessed the performance of these advanced algorithms against both one another and against traditional regression methods.
The precision and error rates of our models fall within acceptable parameters, indicating that they can be effectively utilized to predict hydrate formation temperature, as detailed in our analysis.
In this study, we intend to develop machine learning regression models for the prediction of the hydrate formation pressure (HFP) and hydrate formation temperature (HFT) based on the different chemical compositions of a sweet gas mixture. For this purpose, a dataset (as seen in Table 3) covering the gas quality specifications of the BOTAS Gas Network Code on the Transmission System, approved by the Turkish Energy Market Regulatory Authority (EMRA), was generated using DNV GasVLe software v3.10, which produces thermodynamic data for forecasting the phase behavior and properties of both simple and complicated hydrocarbon-based mixtures over a broad range of temperatures and pressures. Four regression methods, i.e., the decision tree model (DTM), generalized additive model (GAM), random forest model (RFM), and linear model (LM), were then used to build prediction models for the previously listed outputs. The performance of the suggested models was further assessed using the R², Adj. R², RMSE, and N-RMSE metrics together with the hold-out cross-validation technique to guarantee unbiased outcomes. After discussing the sensitivity and validity of the suggested methods, we did not compare the obtained results with thermodynamic models and existing correlations for the HFT, because these correlations are unsuccessful owing to their calculation methods being based mainly on specific gravity.

2. Materials and Methods

2.1. Collection of Data

Table 4 contains statistical summaries of various chemical components and physical properties relevant to a study on predicting hydrate formation temperatures in highly pressurized natural gas pipelines, based on a dataset of 203,820 observations. Since gas composition and operating pressure have a significant impact on the gas hydrate, which has been a challenging issue for nearly 50 years, a large sample size is used for the prediction model to improve the machine learning process. Each row in the table corresponds to different components or properties of natural gas or the conditions within the pipeline, such as nitrogen (N2), carbon dioxide (CO2), methane (CH4), ethane (C2H6), propane (C3H8), alkanes (nC4 to nC9), and water content (H2O)—which is crucial for hydrate formation—critical temperature (Tc), and pressure (Pc), which are important thermodynamic properties.
The columns contain the minimum, maximum, mean, and standard deviation values for each component or property. These metrics help illustrate the range of conditions from min to max, the mean, and the standard deviation considering the gas quality specification of the BOTAS Gas Network Code that is shown in Table 3. This comprehensive statistical profile provides in-depth knowledge of the behavior of the gas mixture under various conditions, which helps in the building of predictive models and facilitating effective predictions of hydrate formation in pipeline operations for maintaining flow assurance and preventing operational challenges, like blockages in natural gas pipelines, surface facilities, and wellheads.

2.2. Methodology

The gas composition of natural gas (N2, CO2, CH4, C2H6, C3H8, iC4, nC4, neoC5, nC5, iC5, nC6, nC7, nC8, nC9, H2O) and pressure (P) are utilized as inputs in order to predict the hydrate formation temperature (HFT).
Several prediction techniques are employed: linear regression, decision tree regression, random forest regression, generalized additive models, and artificial neural networks, implemented through the MATLAB functions fitrlm, fitrtree, fitrensemble, fitrgam, and fitrnet, respectively. These techniques are applied within a standardized framework, using the default settings without alterations. A prediction model is considered effective if it achieves an acceptable margin of error and meets the predetermined target. Given the operational requirements of gas gathering systems and highly pressurized natural gas transmission pipelines for avoiding pipeline blockage, an RMSE of less than 1 °C for HFT predictions is deemed effective. Predictive models can be deployed by considering adaptation factors such as guaranteeing data quality, relevant feature selection, proper model selection, and employing hold-out validation, as illustrated in Figure 2.
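The workflow above (hold-out split, default-settings models, RMSE check against the 1 °C target) can be sketched as follows. This is an illustrative Python/scikit-learn analogue of the MATLAB fitr* calls, with synthetic data standing in for the GasVLe-generated compositions; the feature layout and target function are assumptions for demonstration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 16))  # stand-in for 15 mole fractions + pressure
# Toy non-linear "HFT" target; the real study uses GasVLe thermodynamic data.
y = 10 * X[:, -1] + 5 * np.sin(np.pi * X[:, 2]) + rng.normal(0, 0.1, 2000)

# Hold-out validation: fit on one split, score on unseen data.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"{name}: RMSE = {rmse:.3f}")  # compare against the < 1 degC target
```

On this toy target the tree-based models should achieve a markedly lower hold-out RMSE than the linear model, mirroring the comparison reported in Section 4.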

2.3. Linear Regression

Linear regression is a statistical technique that is applied to model the connection between one or more independent variables and a dependent variable by fitting a linear equation to the observed data. A multiple linear regression model’s equation is typically written as the equation below [Equation (1)].
ŷ = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₙxₙ
where y ^ is the predicted output; x1, x2, …, xn represent the features of the dataset starting from 1 up to n; and β0, β1, …, βn are the coefficients obtained in the training phase. The goal is to minimize the sum of squared residuals, which quantifies the difference between observed values and predicted values.
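A minimal NumPy illustration of Equation (1): the coefficients are found by minimizing the sum of squared residuals via least squares, and the prediction is the intercept plus the weighted features. The toy coefficients below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
beta_true = np.array([2.0, 0.5, -1.0, 3.0])  # [beta_0, beta_1, beta_2, beta_3]
y = beta_true[0] + X @ beta_true[1:]         # noise-free for a clean check

# Augment with an intercept column, then minimize the sum of squared residuals.
A = np.column_stack([np.ones(len(X)), X])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.round(beta_hat, 3))  # recovers [2.0, 0.5, -1.0, 3.0] on noise-free data
```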

2.4. Decision Tree Regression

A decision tree is a non-parametric supervised learning technique for classification and regression applications. In order to create branches that represent the decision rules, it divides the data into subsets according to the input feature values. Every leaf node in the tree represents an outcome, every branch represents a decision rule, and every internal node represents a feature (or attribute) [29]. The construction process involves choosing, at each node, the best feature on which to split the data, typically using criteria such as entropy, Gini impurity, or variance reduction, as in the equation below [Equation (2)].
Gain(C, A) = S(C) − Σ_{Cᵥ ∈ A} (|Cᵥ| / |C|) · S(Cᵥ)
where A is the attribute, C is the cluster set, Cᵥ are the subsets of C induced by the values of A, and S(C) is the entropy of set C.
Recursively splitting the data until a stopping criterion is satisfied, such as a maximum tree depth or a minimum number of samples per leaf, causes the tree to grow.
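The recursive splitting and stopping criteria just described can be sketched with scikit-learn's regression tree (used here as an assumed stand-in for fitrtree), where `max_depth` and `min_samples_leaf` are exactly the stopping criteria mentioned above. The step-function data are illustrative only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(size=(500, 1))
# Step function with a breakpoint at x = 0.5 plus small noise.
y = np.where(X[:, 0] < 0.5, 1.0, 4.0) + rng.normal(0, 0.05, 500)

# Growth stops at depth 3 or when a leaf would hold fewer than 10 samples.
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=10, random_state=0)
tree.fit(X, y)

# Variance reduction places the root split near the true breakpoint x = 0.5.
print(round(tree.tree_.threshold[0], 2))
```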

2.5. Random Forest Regression

Random forest is an ensemble learning technique that constructs several decision trees and combines their findings to increase accuracy and reduce overfitting. During training, it creates a large number of decision trees and outputs the mean prediction (for regression) or the mode of the classes (for classification) over all trees. The approach adds randomness by choosing a random subset of features to split on at each tree node and by employing random samples drawn with replacement to develop each tree [30,31]. This process helps ensure that the individual trees are not correlated and increases the generalization and robustness of the model, as in the equation below [Equation (3)].
f̄_rf^K(x) = (1/K) Σ_{k=1}^{K} T_k(x)
where T_k(x) is the output of the kth tree and K is the number of trees.
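Equation (3) can be verified directly with scikit-learn's RandomForestRegressor (an assumed analogue of the ensemble method used in the study): averaging the individual trees' outputs by hand reproduces the forest's prediction. The toy data are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.uniform(size=(300, 4))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(0, 0.1, 300)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

X_new = rng.uniform(size=(5, 4))
# (1/K) * sum over the K fitted trees T_k(x), as in Equation (3).
manual = np.mean([t.predict(X_new) for t in forest.estimators_], axis=0)
print(np.allclose(manual, forest.predict(X_new)))  # True
```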

2.6. Generalized Additive Model (GAM)

Non-linear interactions between the dependent and independent variables are supported by the flexible generalization of linear models known as the generalized additive model (GAM). The equation below represents the model [Equation (4)].
g(E[Y]) = β₀ + f₁(x₁) + f₂(x₂) + ⋯ + f_m(x_m)
where different f variables represent smooth functions of the predictors. These functions are estimated using techniques such as spline smoothing. GAMs retain the interpretability of linear models while providing flexibility to model complex, non-linear relationships [32,33]. They are particularly useful when it is unclear how predictors and the response variable are related to one another.

3. Evaluation Metrics

3.1. Coefficient of Determination ( R 2 )

A statistical measure known as the coefficient of determination, R2, evaluates the percentage of the dependent variable’s variation that can be predicted from the independent variables, as in the equation below [Equation (5)].
R² = 1 − Σᵢ₌₁ᵐ (yᵢ − ŷᵢ)² / Σᵢ₌₁ᵐ (yᵢ − ȳ)²
where
- ŷᵢ is the predicted value;
- ȳ is the mean of the actual values;
- m is the number of observations;
- yᵢ is the actual value.
When the R2 number equals 1, the variability in the response data around its mean is fully explained by the model; however, when R2 equals 0, none of the variability is explained by the model.
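Equation (5) can be implemented in a few lines and checked against scikit-learn's r2_score; the small vectors below are made-up example values.

```python
import numpy as np
from sklearn.metrics import r2_score

y = np.array([3.0, 5.0, 7.0, 9.0])       # actual values
y_hat = np.array([2.8, 5.1, 7.2, 8.9])   # predicted values

# R^2 = 1 - SS_res / SS_tot, exactly as in Equation (5).
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(np.isclose(r2, r2_score(y, y_hat)), round(r2, 4))  # True 0.995
```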

3.2. Adjusted R 2 ( A d j . R 2 )

The adjusted R² corrects the R² value for the number of predictors in the model. It is employed when several independent variables are present in order to give a more precise estimate of the quality of fit, as in the equation below [Equation (6)].
Adj. R² = 1 − (1 − R²)(m − 1) / (m − n − 1)
where
- m is the number of observations;
- R² is the coefficient of determination;
- n is the number of predictors.
Because it penalizes the addition of non-significant predictors, the adjusted R2 can be more useful than R2 when comparing models with varying numbers of predictors [34].
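Equation (6) in code, with made-up numbers: for m = 100 observations and n = 5 predictors, a raw R² of 0.95 shrinks slightly, illustrating the penalty for extra predictors.

```python
# Adj. R^2 = 1 - (1 - R^2)(m - 1)/(m - n - 1), as in Equation (6).
m, n, r2 = 100, 5, 0.95
adj_r2 = 1 - (1 - r2) * (m - 1) / (m - n - 1)
print(round(adj_r2, 4))  # 0.9473
```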

3.3. Root Mean Square Error (RMSE)

The average magnitude of the errors between anticipated and observed values is measured by the root mean square error (RMSE), as in the equation below [Equation (7)].
RMSE = √( Σᵢ₌₁ᵐ (yᵢ − ŷᵢ)² / m )
where
- m is the number of observations;
- ŷᵢ is the predicted value;
- yᵢ is the actual value.
Since the RMSE is always positive, a smaller number indicates a better fit [35].
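Equation (7) in code, on made-up example values: the RMSE is the square root of the mean squared residual.

```python
import numpy as np

y = np.array([10.0, 12.0, 14.0])      # actual values
y_hat = np.array([11.0, 12.0, 13.0])  # predicted values

rmse = np.sqrt(np.mean((y - y_hat) ** 2))  # Equation (7)
print(round(rmse, 4))  # 0.8165
```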

3.4. Normalized Root Mean Square Error (N-RMSE)

By standardizing the RMSE with the range or mean of the observed data, the normalized root mean square error (N-RMSE) facilitates an easier comparison of model performance across different scales, as in the equation below [Equation (8)].
N-RMSE = (1/ȳ) · √( Σᵢ₌₁ᵐ (yᵢ − ŷᵢ)² / m )
where
- m is the number of observations;
- yᵢ is the actual value;
- ȳ is the mean of the actual values;
- ŷᵢ is the predicted value.
N-RMSE is a dimensionless metric that facilitates easier comparisons between various datasets [36].
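Equation (8) in code, continuing the small example above: dividing the RMSE by the mean of the observed values yields a dimensionless error measure.

```python
import numpy as np

y = np.array([10.0, 12.0, 14.0])      # actual values
y_hat = np.array([11.0, 12.0, 13.0])  # predicted values

rmse = np.sqrt(np.mean((y - y_hat) ** 2))
n_rmse = rmse / y.mean()  # Equation (8): RMSE normalized by the observed mean
print(round(n_rmse, 4))  # 0.068
```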

3.5. Average Absolute Error (AAE)

The average magnitude of absolute errors between predicted and actual values is measured by the average absolute error (AAE), also called the mean absolute error (MAE), as in the equation below [Equation (9)].
AAE = (1/n) Σᵢ₌₁ⁿ |xᵢ − x̂ᵢ|
where
- n is the number of observations;
- x̂ᵢ is the predicted value;
- xᵢ is the actual value.
AAE offers an easily interpreted, straightforward metric for model accuracy; lower values correspond to greater model performance. Considering them as a whole, these measures provide a wide assessment of the model’s performance, including both the prediction accuracy and consistency.
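Equation (9) in code, checked against scikit-learn's mean_absolute_error (the MAE named above); the vectors are made-up example values.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

y = np.array([10.0, 12.0, 14.0])       # actual values
y_hat = np.array([11.0, 12.0, 13.5])   # predicted values

aae = np.mean(np.abs(y - y_hat))  # Equation (9): mean absolute error
print(np.isclose(aae, mean_absolute_error(y, y_hat)), aae)  # True 0.5
```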

4. Results

Observed values of the hydrate formation temperature are plotted on the x-axis of the decision tree model graph, while the predicted values are plotted on the y-axis. The red line represents the ideal y = x line where predictions perfectly match the observations. The blue points represent the test data, while the orange circles represent the training data. The slope line helps visualize the linearity of the predictions (Figure 3). The decision tree model shows predictions by closely following the y = x line, indicating good model performance with both training and test data being closely aligned. This model captures the underlying pattern in the data well but might be prone to overfitting due to the exact matches in the training set.
In the random forest model graph, the observed values and predicted values are similarly plotted, with a y = x reference line. The random forest model shows excellent alignment with the y = x line, indicating high accuracy. Both training and test data points are closely clustered around the ideal line. This model performs slightly better than the decision tree model by averaging multiple trees, which reduces overfitting and improves generalization.
In the generalized additive model graph, the same observed versus predicted value comparison is shown. The GAM shows a reasonable alignment with the y = x line, though there is more scatter compared to the random forest and decision tree models. This indicates that there is some deviation in predictions, especially at the extreme ends of the observed values. GAM captures non-linear relationships but may not fit as precisely as ensemble methods for complex datasets. It provides better interpretability of individual predictors.
In the linear model graph, the observed versus predicted values are plotted similarly. The linear model shows a significant deviation from the y = x line, especially for higher observed values. This indicates poor performance, with many predictions being far from the actual values. This model fails to capture the non-linear nature of the data. It is overly simplistic for the complex relationships in hydrate formation temperature prediction.
Both decision tree and random forest models show good performance, but random forest has a slight edge due to reduced overfitting and better generalization. Random forest’s predictions are more tightly clustered around the y = x line. GAM performs better than the LM by capturing non-linear relationships, but it still is not satisfactory compared to random forest and decision tree. The random forest model shows the best performance, with predictions closely aligning with the observed values, indicating high accuracy and reliability in predicting hydrate formation temperature in highly pressurized natural gas pipelines. These comparisons highlight the weaknesses and strengths of each model in terms of their predictive accuracy and the ability to solve complex and non-linear relationships in the data.
In Table 5, the results of the evaluation metrics for the generalized additive model (GAM), decision tree model (DTM), random forest model (RFM), and linear model (LM) are summarized. Each model’s performance is assessed with a correlation coefficient (R2), adjusted R2 metrics, average absolute error (AAE), root mean square error (RMSE), and normalized RMSE (N-RMSE).
The decision tree model performs quite well with an R2 and Adj. R2 of 0.999. The RMSE is low at 0.351, indicating minimal deviation between the actual and predicted values. An N-RMSE value of 2.047 and an AAE value of 0.129 suggest very high accuracy and precision, demonstrating how well the model fits the observed data. These measures show how well the decision tree model captures the underlying patterns in the data, despite the possibility of overfitting due to its high alignment with the training dataset.
The random forest model also exhibits strong performance with an R2 and adjusted R2 of 0.998. Its RMSE is slightly higher at 0.659 compared to the decision tree model, but this still indicates high accuracy. An N-RMSE of 11.222 and an AAE of 0.401, while higher than those of the decision tree model, still reflect good predictive capability. The random forest model benefits from averaging multiple trees, which reduces the risk of overfitting and enhances its generalization to unseen data.
The generalized additive model achieves an R² and adjusted R² of 0.964, showing its capability to model non-linear relationships in the data. However, its RMSE of 2.593 is higher than those of the decision tree and random forest models, indicating larger prediction errors. Its N-RMSE of 20.012 and AAE of 1.316 are significantly higher as well, suggesting that while the GAM can handle non-linearities, it does not perform as well as the ensemble methods in this context.
The linear model demonstrates the worst performance of all the models, with an R² and adjusted R² of 0.604. Its RMSE of 8.627 shows that its predictions are significantly inaccurate. The complexity of the hydrate formation temperature data is not adequately captured by the model, as evidenced by an N-RMSE of 33.805 and an AAE of 2.523. Given the simplistic nature of linear regression, it is not well suited to modeling the non-linear relationships in this dataset.
In conclusion, the generalized additive model (GAM) and linear model (LM) are not as effective in predicting the hydrate temperature as the random forest regression (RFR) and decision tree regression (DTR) models. The DTR model shows slightly better performance metrics than the RFR model, even though it is more prone to overfitting, while the RFR model offers balanced accuracy. The GAM, while useful for capturing non-linear relationships, is less accurate than the ensemble methods. The linear model's performance is inadequate for this application, as mentioned before, highlighting the need for more sophisticated models to predict the hydrate formation temperature accurately in highly pressurized natural gas pipelines, surface facilities, and wellheads. The key finding is that the RFR model provides the best overall predictive performance for this complex problem.

5. Conclusions

In this study, machine learning regression models were developed to predict the hydrate formation temperature based on a sweet gas mixture's chemical composition. The necessary thermodynamic data were generated using the DNV GasVLe program, and 203,820 data points were collected in accordance with the gas quality specification of the BOTAS Gas Network Code. The DTR and RFR models performed best among all the models presented here; the random forest model generalized slightly better, whereas the linear model showed the worst performance of all the models, with an R² and adjusted R² of 0.604. The DTR model also worked efficiently, with very high accuracy and an R² and Adj. R² of 0.999, but slightly overfitted the data. The GAM and LM may prove adequate in other scenarios; however, in this study of the hydrate formation temperature they did not predict the intricate relationships as soundly as the DTR and random forest models. The results show that ensemble learning techniques, especially random forest, capture the complex relationships inherent in this dataset very well and can provide accurate predictions.
Overall, this work highlights the potential of advanced machine learning methods to improve the predictability of hydrate formation conditions in natural gas pipelines, contributing to improved safety and efficiency in gas transport and processing. Future works could extend these models to broader datasets and explore hybrid approaches for further optimization.
In our upcoming research, we will broaden our approach by employing a wider range of methodologies and by utilizing a more extensive dataset. This will encompass the prediction of the hydrate formation temperature in sweet gas containing both thermodynamic and kinetic inhibitors, as well as investigating other aspects beyond temperature prediction. We are confident that machine learning can empower researchers and gas transmission and offshore production operators to develop highly precise models for hydrate formation temperatures, leveraging the diverse capabilities of various machine learning-based modeling techniques.

Author Contributions

Conceptualization, Ö.Y.; methodology, Ö.Y.; software, M.K. and Ö.Y.; validation, M.K. and Ö.Y.; formal analysis, M.K. and Ö.Y.; investigation, M.K. and Ö.Y.; resources, M.K. and Ö.Y.; data curation, M.K.; writing—original draft preparation, M.K.; writing—review and editing, M.K. and Ö.Y.; visualization, M.K. and Ö.Y.; supervision, Ö.Y.; project administration, Ö.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets generated during this study are not publicly available. However, parts of the datasets can be made available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

TTF  Title Transfer Facility
AAE  Average absolute error
DTR  Decision tree regression
RFR  Random forest regression
RMSE  Root mean square error
N-RMSE  Normalized root mean square error
MAE  Mean absolute error
GAM  Generalized additive model
ANN  Artificial neural network
HFT  Hydrate formation temperature
HFP  Hydrate formation pressure
GMDH  Hybrid Group Method of Data Handling
GEP  Gene expression programming
SG  Specific gravity
P  Pressure

References

  1. Elliott, S. Europe Adapts, Not Without Difficulty, to Life without Russian Gas; S&P Global: New York, NY, USA, 2023. [Google Scholar]
  2. Sloan, E.D. Fundamental principles and applications of natural gas hydrates. Nature 2003, 426, 353–363. [Google Scholar] [CrossRef]
  3. Koh, C.A.; Sloan, E.D.; Sum, A.K.; Wu, D.T. Fundamentals and applications of gas hydrates. Annu. Rev. Chem. Biomol. Eng. 2011, 2, 237–257. [Google Scholar] [CrossRef]
  4. Hester, K.C.; Dunk, R.M.; White, S.N.; Brewer, P.; Peltzer, E.; Sloan, E. Gas hydrate measurements at Hydrate Ridge using Raman spectroscopy. Geochim. Cosmochim. Acta 2007, 71, 2947–2959. [Google Scholar] [CrossRef]
  5. Lu, H.; Seo, Y.T.; Lee, J.W.; Moudrakovski, I.; Ripmeester, J.A.; Chapman, N.R.; Coffin, R.B.; Gardner, G.; Pohlman, J. Complex gas hydrate from the Cascadia margin. Nature 2007, 445, 303–306. [Google Scholar] [CrossRef]
  6. Hammerschmidt, E. Formation of gas hydrates in natural gas transmission lines. Ind. Eng. Chem. 1934, 26, 851–855. [Google Scholar] [CrossRef]
  7. Katz, D.L. Prediction of conditions for hydrate formation in natural gases. Trans. AIME 1945, 160, 140–149. [Google Scholar]
  8. Wilcox, W.I.; Carson, D.; Katz, D. Natural gas hydrates. Ind. Eng. Chem. 1941, 33, 662–665. [Google Scholar] [CrossRef]
  9. Baillie, C.; Wichert, E. Chart gives hydrate formation temperature for natural gas. Oil Gas J. 1987, 85, 37–39. [Google Scholar]
  10. Mann, S.L. Vapor–Solid Equilibrium Ratios for Structure I and II Natural Gas Hydrates; Gas Processors Association: Tulsa, OK, USA, 1988. [Google Scholar]
  11. Makogon, I.U. Hydrates of Natural Gas; PennWell Books: Tulsa, OK, USA, 1981. [Google Scholar]
  12. Berge, B. Hydrate predictions on a microcomputer. In Petroleum Industry Application of Microcomputers; SPE: Denver, CO, USA, 1986. [Google Scholar]
  13. Kobayashi, R.; Song, K.Y.; Sloan, E.D. Phase behavior of water/hydrocarbon systems. In Petroleum Engineering Handbook; SPE: Denver, CO, USA, 1987; Volume 25, p. e13. [Google Scholar]
  14. Motiee, M. Estimate possibility of hydrates. Hydrocarb. Process. 1991, 70, 98–99. [Google Scholar]
  15. Carroll, J. Natural Gas Hydrates: A Guide for Engineers; Elsevier Science: Amsterdam, The Netherlands, 2009. [Google Scholar]
  16. Towler, B.; Mokhatab, S. Quickly estimate hydrate formation conditions in natural gases. Hydrocarb. Process. 2005, 84, 61–62. [Google Scholar]
  17. Bahadori, A.; Vuthaluru, H.B. A novel correlation for estimation of hydrate forming condition of natural gases. J. Nat. Gas Chem. 2009, 18, 453–457. [Google Scholar] [CrossRef]
  18. Safamirzaei, M. Predict Gas Hydrate Formation Temperature with a Simple Correlation. Gas Processing News, 18 September 2015. [Google Scholar]
  19. Salufu, S.O.; Nwakwo, P. New empirical correlation for predicting hydrate formation conditions. In Proceedings of the SPE Nigeria Annual International Conference and Exhibition, Society of Petroleum Engineers, Lagos, Nigeria, 30 July–1 August 2013. [Google Scholar]
  20. Mohamadi-Baghmolaei, M.; Hajizadeh, A.; Azin, R.; Izadpanah, A.A. Assessing thermodynamic models and introducing novel method for prediction of methane hydrate formation. J. Pet. Explor. Prod. Technol. 2018, 8, 1401–1412. [Google Scholar] [CrossRef]
  21. Chapoy, A.; Mohammadi, A.-H.; Richon, D. Predicting the hydrate stability zones of natural gases using artificial neural networks. Oil Gas Sci. Technol. Rev. L'IFP 2007, 62, 701–706. [Google Scholar] [CrossRef]
  22. Zahedi, G.; Karami, Z.; Yaghoobi, H. Prediction of hydrate formation temperature by both statistical models and artificial neural network approaches. Energy Convers. Manag. 2009, 50, 2052–2059. [Google Scholar] [CrossRef]
  23. Riazi, S.H.; Heydari, H.; Ahmadpour, E.; Gholami, A.; Parvizi, S. Development of novel correlation for prediction of hydrate formation temperature based on intelligent optimization algorithms. J. Nat. Gas Sci. Eng. 2014, 18, 377–384. [Google Scholar] [CrossRef]
  24. Hesami, S.M.; Dehghani, M.; Kamali, Z.; Ejraei Bakyani, A. Developing a simple-to-use predictive model for prediction of hydrate formation temperature. Int. J. Ambient. Energy 2017, 38, 380–388. [Google Scholar] [CrossRef]
  25. Elgibaly, A.A.; Elkamel, A.M. A new correlation for predicting hydrate formation conditions for various gas mixtures and inhibitors. Fluid Phase Equilib. 1998, 152, 23–42. [Google Scholar] [CrossRef]
  26. Hosseini, M.; Leonenko, Y. A reliable model to predict the methane-hydrate equilibrium: An updated database and machine learning approach. Renew. Sustain. Energy Rev. 2023, 173, 113103. [Google Scholar] [CrossRef]
  27. Nait Amar, M. Prediction of hydrate formation temperature using gene expression programming. J. Nat. Gas Sci. Eng. 2021, 89, 103879. [Google Scholar] [CrossRef]
  28. Mesbah, M.; Habibnia, S.; Ahmadi, S.; Dehaghani, A.; Bayat, S. Developing a robust correlation for prediction of sweet and sour gas hydrate formation temperature. Petroleum 2022, 8, 204–209. [Google Scholar] [CrossRef]
  29. Klusowski, J.M.; Tian, P.M. Large scale prediction with decision trees. J. Am. Stat. Assoc. 2024, 119, 525–537. [Google Scholar] [CrossRef]
  30. Scornet, E.; Biau, G.; Vert, J.P. Consistency of random forests. Ann. Stat. 2015, 43, 1716–1741. [Google Scholar] [CrossRef]
  31. Criminisi, A.; Shotton, J.; Konukoglu, E. Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Found. Trends® Comput. Graph. Vis. 2012, 7, 81–227. [Google Scholar] [CrossRef]
  32. De Bock, K.W.; Coussement, K.; Van den Poel, D. Ensemble classification based on generalized additive models. Comput. Stat. Data Anal. 2010, 54, 1535–1546. [Google Scholar] [CrossRef]
  33. Brezger, A.; Lang, S. Generalized structured additive regression based on Bayesian P-splines. Comput. Stat. Data Anal. 2006, 50, 967–991. [Google Scholar] [CrossRef]
  34. Elmaz, F.; Yücel, Ö. Data-driven identification and model predictive control of biomass gasification process for maximum energy production. Energy 2020, 195, 117037. [Google Scholar] [CrossRef]
  35. Insel, M.A.; Yucel, O.; Sadikoglu, H. Higher heating value estimation of wastes and fuels from ultimate and proximate analysis by using artificial neural networks. Waste Manag. 2024, 185, 33–42. [Google Scholar] [CrossRef]
  36. Yaka, H.; Insel, M.A.; Yucel, O.; Sadikoglu, H. A comparison of machine learning algorithms for estimation of higher heating values of biomass and fossil fuels from ultimate analysis. Fuel 2022, 320, 123971. [Google Scholar] [CrossRef]
Figure 1. Different types of cages and hydrate structures.
Figure 2. Flowchart of predictive modeling of hydrate formation temperature.
Figure 3. Predictions vs. observations plots for (a) Decision Tree regression, (b) Random Forest regression, (c) Generalized Additive regression, and (d) Linear regression.
Table 1. Calculated hydrate formation temperature according to different compositions of natural gas.
| N2 | CO2 | CH4 | C2H6 | C3H8 | nC4 | nC5 | nC6 | nC7 | H2O | T (°C) | P (bar) | SG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 99.84 | 0 | 0.15 | 0 | 0 | 0 | 0 | 0.01 | −4.09 | 50.00 | 0.56 |
| 0 | 0 | 99.82 | 0 | 0.15 | 0 | 0 | 0 | 0 | 0.03 | 7.48 | 50.00 | 0.56 |
| 0 | 0 | 99.80 | 0 | 0.15 | 0 | 0 | 0 | 0 | 0.05 | 7.48 | 50.00 | 0.56 |
| 0 | 0 | 98.15 | 0 | 0 | 0 | 0 | 0.92 | 0.92 | 0.01 | 1.49 | 100.07 | 0.60 |
| 0 | 0 | 98.13 | 0 | 0 | 0 | 0 | 0.92 | 0.92 | 0.03 | 16.03 | 100.07 | 0.60 |
| 0 | 0 | 98.11 | 0 | 0 | 0 | 0 | 0.92 | 0.92 | 0.05 | 23.26 | 100.07 | 0.60 |
| 0 | 0 | 98.09 | 0 | 0 | 0 | 0 | 0.92 | 0.92 | 0.07 | 28.20 | 100.07 | 0.60 |
| 3.8 | 0 | 92.91 | 0 | 2.60 | 0 | 0 | 0.68 | 0 | 0.01 | 0.74 | 74.87 | 0.61 |
| 3.8 | 0 | 92.89 | 0 | 2.60 | 0 | 0 | 0.68 | 0 | 0.03 | 14.58 | 74.87 | 0.61 |
| 3.8 | 0 | 92.87 | 0 | 2.60 | 0 | 0 | 0.68 | 0 | 0.05 | 21.49 | 74.87 | 0.61 |
| 3.8 | 0 | 92.85 | 0 | 2.60 | 0 | 0 | 0.68 | 0 | 0.07 | 17.09 | 74.87 | 0.61 |
| 2.8 | 1.5 | 93.87 | 0 | 0 | 1.30 | 0 | 0.52 | 0 | 0.01 | −0.67 | 74.87 | |
| 0 | 2.75 | 94.48 | 0 | 0 | 0 | 0.92 | 0.92 | 0.92 | 0.01 | 1.03 | 100.07 | 0.65 |
| 0 | 2.75 | 94.46 | 0 | 0 | 0 | 0.92 | 0.92 | 0.92 | 0.03 | 15.63 | 100.07 | 0.65 |
| 0 | 2.75 | 94.44 | 0 | 0 | 0 | 0.92 | 0.92 | 0.92 | 0.05 | 22.89 | 100.07 | 0.65 |
| 0 | 2.75 | 94.42 | 0 | 0 | 0 | 0.92 | 0.92 | 0.92 | 0.07 | 27.85 | 100.07 | 0.65 |
Table 2. Comparison of studies in the literature for the estimation of the HFT and this study.
| Data Size | Method | Validation Method (Train/Test) | Specific Gravity | Pressure (MPa) | Temperature (K) | Ref. No |
|---|---|---|---|---|---|---|
| 203 | ANN | 136/67 | 0.55–1 | 1.37–18.47 | 274.1–297.4 | [22] |
| 120 | GA-PSA | | | 0.28–100 | 275–330 | [23] |
| 377 | ANN | 283/94 | 0.55–1.52 | 0.042–548 | 178.3–324.1 | [24] |
| 2387 | ANN | | 0.55–2.01 | 0.04–0.39 | 148–320 | [25] |
| 987 | ANN | 80/20 | | 3–10 | 264–284 | [26] |
| 279 | GEP | 223/56 | 0.56–0.83 | 0.58–62.85 | 273.7–303.1 | [27] |
| 343 | GMDH | 241/102 | | 0.58–62.85 | 273.2–304.8 | [28] |
| 203,820 | ML | 70/30 (%) | 0.555–0.716 | 1.01–500.0 (bar) | −42.5–34.0 (°C) | This Study |
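The 70/30 hold-out validation reported for this study in Table 2 can be illustrated with a short sketch. The function below is a hypothetical helper, not code from the paper: it shuffles sample indices with NumPy and reserves 30% of the data as an unseen test set.

```python
import numpy as np

def holdout_split(X, y, test_frac=0.3, seed=42):
    """Randomly reserve a test fraction of the data (hold-out validation)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))            # shuffle sample indices
    n_test = int(round(test_frac * len(X)))  # size of the unseen test set
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]

# Hypothetical example: 10 samples with 3 composition features each
X = np.arange(30, dtype=float).reshape(10, 3)
y = np.arange(10, dtype=float)  # stand-in for hydrate formation temperatures
X_tr, X_te, y_tr, y_te = holdout_split(X, y)
print(len(X_tr), len(X_te))  # 7 3
```

A regressor (e.g., a random forest) is then fitted on the training portion only, and performance is scored on the reserved 30%, which is what keeps the reported results unbiased.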
Table 3. Gas quality specification of the BOTAS Gas Network Code on Transmission System.
Table 3. Gas quality specification of the BOTAS Gas Network Code on Transmission System.
| Chemical Component | Mole % |
|---|---|
| CH4 | 82–100 |
| C2H6 | 0–12 |
| C3H8 | 0–4 |
| nC4 | 0–2.5 |
| nC5+ | 0–1 |
| N2 | 0–5.8 |
| CO2 | 0–3 |
| O2 | 0–0.5 |
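A candidate composition can be screened against these limits programmatically. The snippet below is a minimal, hypothetical checker: the component names and limits mirror Table 3, but the function itself is not part of the BOTAS code, and it only validates components that are present in the input dictionary.

```python
# Table 3 limits in mole % (component: (min, max))
SPEC = {
    "CH4": (82.0, 100.0), "C2H6": (0.0, 12.0), "C3H8": (0.0, 4.0),
    "nC4": (0.0, 2.5), "nC5+": (0.0, 1.0), "N2": (0.0, 5.8),
    "CO2": (0.0, 3.0), "O2": (0.0, 0.5),
}

def out_of_spec(composition):
    """Return the components (mole %) that violate the quality limits."""
    return [c for c, x in composition.items()
            if c in SPEC and not (SPEC[c][0] <= x <= SPEC[c][1])]

sample = {"CH4": 93.9, "C2H6": 2.5, "C3H8": 0.7, "N2": 0.9, "CO2": 0.9}
print(out_of_spec(sample))  # [] -> meets the specification
```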
Table 4. Statistical information of data.
| | Min | Max | Mean | Std |
|---|---|---|---|---|
| N2 | 0.000 | 5.800 | 0.936 | 1.701 |
| CO2 | 0.000 | 3.000 | 0.887 | 1.030 |
| CH4 | 75.130 | 99.870 | 93.888 | 4.792 |
| C2H6 | 0.000 | 12.000 | 2.456 | 3.799 |
| C3H8 | 0.000 | 4.000 | 0.703 | 1.208 |
| iC4 | 0.000 | 0.000 | 0.000 | 0.000 |
| nC4 | 0.000 | 2.500 | 0.507 | 0.791 |
| neoC5 | 0.000 | 0.000 | 0.000 | 0.000 |
| nC5 | 0.000 | 1.000 | 0.203 | 0.316 |
| iC5 | 0.000 | 0.000 | 0.000 | 0.000 |
| nC6 | 0.000 | 1.000 | 0.187 | 0.309 |
| nC7 | 0.000 | 1.000 | 0.190 | 0.310 |
| nC8 | 0.000 | 0.000 | 0.000 | 0.000 |
| nC9 | 0.000 | 0.000 | 0.000 | 0.000 |
| H2O | 0.010 | 0.070 | 0.040 | 0.022 |
| T (°C) | −42.509 | 34.021 | 5.489 | 13.708 |
| P (bar) | 1.013 | 500.000 | 56.632 | 45.473 |
Table 5. Performance analysis of the proposed machine learning models.
| Models | R² | Adj. R² | RMSE | N-RMSE | AAE |
|---|---|---|---|---|---|
| DTR | 0.999 | 0.999 | 0.351 | 2.047 * | 0.129 * |
| RFR | 0.998 | 0.998 | 0.659 | 11.222 * | 0.401 * |
| GAM | 0.964 | 0.964 | 2.593 | 20.012 * | 1.316 * |
| LM | 0.604 | 0.604 | 8.627 | 33.805 * | 2.523 * |

* Values < 1 × 10⁻³ neglected.
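For reference, metrics of this kind can be reproduced from a model's predictions with a few lines of NumPy. This is a sketch under stated assumptions: N-RMSE is taken here as RMSE normalized by the observed temperature range (in percent) and AAE as the mean absolute error; the paper's exact definitions may differ.

```python
import numpy as np

def regression_metrics(y_true, y_pred, n_features):
    """R2, adjusted R2, RMSE, range-normalized RMSE (%), and AAE."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    n = len(y_true)
    ss_res = np.sum((y_true - y_pred) ** 2)                 # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)          # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)  # penalize feature count
    rmse = np.sqrt(ss_res / n)
    n_rmse = 100 * rmse / (y_true.max() - y_true.min())     # % of observed range
    aae = np.mean(np.abs(y_true - y_pred))
    return r2, adj_r2, rmse, n_rmse, aae

# Hypothetical observed vs. predicted hydrate formation temperatures (°C)
y_obs = [1.0, 5.0, 3.0, 8.0]
y_hat = [1.2, 4.8, 3.1, 7.9]
r2, adj_r2, rmse, n_rmse, aae = regression_metrics(y_obs, y_hat, n_features=2)
print(round(r2, 3), round(aae, 3))  # 0.996 0.15
```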
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Karaköse, M.; Yücel, Ö. Predictive Modeling of the Hydrate Formation Temperature in Highly Pressurized Natural Gas Pipelines. Energies 2024, 17, 5306. https://doi.org/10.3390/en17215306

