MDPI - Publisher of Open Access Journals

19 pages, 484 KiB

Open Access Article

BiModalClust: Fused Data and Neighborhood Variation for Advanced K-Means Big Data Clustering

by Ravil Mussabayev and Rustam Mussabayev

Appl. Sci. 2025, 15(3), 1032; https://doi.org/10.3390/app15031032 - 21 Jan 2025

Viewed by 524

K-means clustering is a fundamental tool in data mining, yet its scalability and efficacy decline when faced with massive datasets. In this work, we introduce BiModalClust, a novel clustering algorithm that leverages a bimodal optimization paradigm to overcome these challenges. Our approach simultaneously optimizes two interdependent modalities: the input data stream and the neighborhood structure of the solution landscape, which emerges from iterative restrictions of the Minimum Sum-of-Squares Clustering (MSSC) objective function to sampled subsets of the data. By integrating the Variable Neighborhood Search (VNS) metaheuristic, we systematically explore and refine these landscapes through dynamic reinitialization of degenerate centroids and adaptive exploration of expanding neighborhoods. This dual-stream optimization not only transforms traditional local search into a more global and robust process but also ensures computational scalability and precision. Extensive experimentation on diverse real-world datasets demonstrates that BiModalClust achieves superior clustering performance among K-means-based methods in big data environments.
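
For reference, the MSSC objective named in this abstract is usually written as below, together with its restriction to a sampled subset. The notation (data set X with points x_i, centroids c_1..c_k collected in C, sample S) is assumed here for illustration and is not taken from the article.

```latex
f(C, X) = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - c_j \rVert^2,
\qquad
f(C, S) = \sum_{x \in S} \min_{1 \le j \le k} \lVert x - c_j \rVert^2,
\quad S \subseteq X .
```

Each sample S yields a different restricted objective f(C, S) over the same centroids, which is one way to read the "solution landscape" that the abstract says changes with the sampled data.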

(This article belongs to the Section Computing and Artificial Intelligence)

23 pages, 29777 KiB

Open Access Article

Monitoring and Prevention Strategies for Iron and Aluminum Pollutants in Acid Mine Drainage (AMD): Evidence from Xiaomixi Stream in Qinling Mountains

by Xiaoya Wang, Min Yang, Huaqing Chen, Zongming Cai, Weishun Fu, Xin Zhang, Fangqiang Sun and Yangquan Li

Minerals 2025, 15(1), 59; https://doi.org/10.3390/min15010059 - 8 Jan 2025

Viewed by 672

Abstract

Acid mine drainage (AMD) generated during the exploitation and utilization of mineral resources poses a severe environmental problem globally within the mining industry. The Xiaomixi Stream in Ziyang County, Shaanxi Province, is a primary tributary of the Han River, which is surrounded by historically concentrated mining areas for stone coal and vanadium ores. Rainwater erosion of abandoned mine tunnels and waste rock piles has led to the leaching of acidic substances and heavy metals, which then enter the Haoping River and its tributaries through surface runoff. This results in acidic water, posing a significant threat to the water quality of the South-to-North Water Diversion Middle Route within the Han River basin. According to this study’s investigation, Xiaomixi’s acidic water exhibits yellow and white precipitates upstream and downstream of the river, respectively. These precipitates stem from the oxidation of iron-bearing minerals and aluminum-bearing minerals. The precipitation process is controlled by factors such as the pH and temperature, exhibiting seasonal variations. Taking the Xiaomixi Stream in Ziyang County, Shaanxi Province, as the study area, this paper conducts field investigations, systematic sampling of water bodies and river sediments, testing for iron and aluminum pollutants in water, and micro-area observations using field emission scanning electron microscopy (FESEM) on sediments, along with analyzing the iron and aluminum content. The deposition is analyzed using handheld X-ray fluorescence (XRF) analyzers, X-ray diffraction (XRD), and visible–near-infrared spectroscopy data, and a geochemical model is established using PHREEQC software. This paper summarizes the migration and transformation mechanisms of iron and aluminum pollutants in acidic water and proposes appropriate prevention and control measures.

(This article belongs to the Special Issue Acid Mine Drainage: A Challenge or an Opportunity?)

29 pages, 1577 KiB

Open Access Article

DIAFM: An Improved and Novel Approach for Incremental Frequent Itemset Mining

by Mohsin Shaikh, Sabina Akram, Jawad Khan, Shah Khalid and Youngmoon Lee

Mathematics 2024, 12(24), 3930; https://doi.org/10.3390/math12243930 - 13 Dec 2024

Viewed by 630

Abstract

Traditional approaches to data mining are generally designed for small, centralized, and static datasets. However, when a dataset grows at an enormous rate, the algorithms become infeasible in terms of huge consumption of computational and I/O resources. Frequent itemset mining (FIM) is one of the key algorithms in data mining and finds applications in a variety of domains; however, traditional algorithms do face problems in efficiently processing large and dynamic datasets. This research introduces a distributed incremental approximation frequent itemset mining (DIAFM) algorithm that tackles the mentioned challenges using shard-based approximation within the MapReduce framework. DIAFM minimizes the computational overhead of a program by reducing dataset scans, bypassing exact support checks, and incorporating shard-level error thresholds for an appropriate trade-off between efficiency and accuracy. Extensive experiments have demonstrated that DIAFM reduces runtime by 40–60% compared to traditional methods with losses in accuracy within 1–5%, even for datasets over 500,000 transactions. Its incremental nature ensures that new data increments are handled efficiently without needing to reprocess the entire dataset, making it particularly suitable for real-time, large-scale applications such as transaction analysis and IoT data streams. These results demonstrate the scalability, robustness, and practical applicability of DIAFM and establish it as a competitive and efficient solution for mining frequent itemsets in distributed, dynamic environments.

(This article belongs to the Special Issue Advances in Mathematical Methods for Distributed Learning and High-Dimensional Data Analysis)

17 pages, 2189 KiB

Open Access Article

Refinement and Validation of the SPEcies at Risk Index for Metals (SPEAR_metal Index) for Assessing Ecological Impacts of Metal Contamination in the Nakdong River, South Korea

by Dae-sik Hwang, Jongwoo Kim, Jiwoong Chung and Jonghyeon Lee

Water 2024, 16(22), 3308; https://doi.org/10.3390/w16223308 - 18 Nov 2024

Viewed by 587

Abstract

The SPEcies At Risk index for metals (SPEAR_metal index) was refined using updated physiological sensitivity data and validated to assess the ecological impact of metal contamination on benthic macroinvertebrate communities in the upper Nakdong River, near a Zn smelter in Korea. Biosurvey and chemical monitoring data were collected at 18 sites surrounding the smelter and nearby mines. Acute ecotoxicity tests on 20 indigenous species from the Korean peninsula were conducted and used to update taxon-specific metal sensitivity data. The refined SPEAR_metal index, based on this updated sensitivity, was significantly lower than previous versions, with most values below the severe impact threshold (0.5) in the main stream. The correlation between hazard quotients in water and the SPEAR index improved, with the correlation coefficient increasing from 0.63 to 0.70. Despite consistently high benthic macroinvertebrate indices (BMIs) across the study area, generic ecological indices, such as total richness, EPT (Ephemeroptera, Plecoptera, and Trichoptera taxa richness), and Shannon’s diversity index, showed correlations with metal contamination levels. Principal component analysis identified the SPEAR_metal index as the primary indicator associated with metal contamination in both water and sediment. These findings highlight the improved performance of the refined SPEAR_metal index as a more sensitive and specific tool for assessing the ecological status of metal-impacted aquatic ecosystems compared to traditional indices.

(This article belongs to the Section Water Quality and Contamination)

19 pages, 1160 KiB

Open Access Article

Enhancing Explainable Recommendations: Integrating Reason Generation and Rating Prediction through Multi-Task Learning

by Xingyu Zhu, Xiaona Xia, Yuheng Wu and Wenxu Zhao

Appl. Sci. 2024, 14(18), 8303; https://doi.org/10.3390/app14188303 - 14 Sep 2024

Viewed by 1439

Abstract

In recent years, recommender systems—which provide personalized recommendations by analyzing users’ historical behavior to infer their preferences—have become essential tools across various domains, including e-commerce, streaming media, and social platforms. Recommender systems play a crucial role in enhancing user experience by mining vast amounts of data to identify what is most relevant to users. Among these, deep learning-based recommender systems have demonstrated exceptional recommendation performance. However, these “black-box” systems lack reasonable explanations for their recommendation results, which reduces their impact and credibility. To address this situation, an effective strategy is to provide a personalized textual explanation along with the recommendation. This approach has received increasing attention from researchers because it can enhance users’ trust in recommender systems through intuitive explanations. In this context, our paper introduces a novel explainable recommendation model named GCLTE. This model integrates Graph Contrastive Learning with transformers within an Encoder–Decoder framework to perform rating prediction and reason generation simultaneously. In addition, we cleverly combine the neural network layer with the transformer using a straightforward information enhancement operation. Finally, our extensive experiments on three real-world datasets demonstrate the effectiveness of GCLTE in both recommendation and explanation. The experimental results show that our model outperforms the top existing models.

(This article belongs to the Special Issue Applied and Innovative Computational Intelligence Systems: 3rd Edition)

19 pages, 7689 KiB

Open Access Article

Development of High-Silica Adakitic Intrusions in the Northern Appalachians of New Brunswick (Canada), and Their Correlation with Slab Break-Off: Insights into the Formation of Fertile Cu-Au-Mo Porphyry Systems

by Fazilat Yousefi, David R. Lentz, James A. Walker and Kathleen G. Thorne

Geosciences 2024, 14(9), 241; https://doi.org/10.3390/geosciences14090241 - 7 Sep 2024

Cited by 1 | Viewed by 1116

Abstract

High-silica adakites exhibit specific compositions, as follows: SiO₂ ≥ 56 wt.%, Al₂O₃ ≥ 15 wt.%, Y ≤ 18 ppm, Yb ≤ 1.9 ppm, K₂O/Na₂O ≥ 1, MgO < 3 wt.%, high Sr/Y (≥10), and La/Yb (>10). Devonian I-type adakitic granitoids in the northern Appalachians of New Brunswick (NB, Canada) share geochemical signatures of adakites elsewhere, i.e., SiO₂ ≥ 66.46 wt.%, Al₂O₃ > 15.47 wt.%, Y ≤ 22 ppm, Yb ≤ 2 ppm, K₂O/Na₂O > 1, MgO < 3 wt.%, Sr/Y ≥ 33 to 50, and La/Yb > 10. Remarkably, adakitic intrusions in NB, including the Blue Mountain Granodiorite Suite, Nicholas Denys, Sugar Loaf, Squaw Cap, North Dungarvan River, Magaguadavic Granite, Hampstead Granite, Tower Hill, Watson Brook Granodiorite, Rivière-Verte Porphyry, Eagle Lake Granite, Evandale Granodiorite, North Pole Stream Suite, and the McKenzie Gulch porphyry dykes all have associated Cu mineralization, similar to the Middle Devonian Cu porphyry intrusions in Mines Gaspé, Québec. Trace element data support the connection between adakite formation and slab break-off, a mechanism influencing fertility and generation of porphyry Cu systems. These adakitic rocks in NB are oxidized, and are relatively enriched in large ion lithophile elements, like Cs, Rb, Ba, and Pb, and depleted in some high field strength elements, like Y, Nb, Ta, P, and Ti; they also have Sr/Y ≥ 33 to 50, Nb/Y > 0.4, Ta/Yb > 0.3, La/Yb > 10, Sm/Yb > 2.5, Gd/Yb > 2.0, Nb + Y < 60 ppm, and Ta + Yb < 6 ppm. These geochemical indicators point to failure of a subducting oceanic slab (slab rollback to slab break-off) in the terminal stages of subduction, as the generator of post-collisional granitoid magmatism. The break-off and separation of a dense subducted oceanic plate segment leads to upwelling asthenosphere, heat advection, and selective partial melting of the descending oceanic slab (adakite) and (or) suprasubduction zone lithospheric mantle. The resulting silica-rich adakitic magmas ascend through thickened mantle lithosphere, with minimal effect from the asthenosphere. The critical roles of transpression and transtension are highlighted in facilitating the ascent and emplacement of these fertile adakitic magmas in postsubduction zone settings.

(This article belongs to the Special Issue Zircon U-Pb Geochronology Applied to Tectonics and Ore Deposits)

18 pages, 10594 KiB

Open Access Article

A Framework for Characterizing Spatio-Temporal Variation of Turbidity and Drivers in the Navigable and Turbid River: A Case Study of Xitiaoxi River

by Min Zhang, Renhua Yan, Junfeng Gao, Suding Yan and Jialong Yan

Water 2024, 16(17), 2503; https://doi.org/10.3390/w16172503 - 3 Sep 2024

Viewed by 1051

Abstract

Turbidity, as a key indicator of water quality linked to underwater light attenuation, is crucial for evaluating water quality. Control in high-turbidity water environments plays a critical role in navigable rivers. For this purpose, our study proposed a framework for analyzing the spatio-temporal variation of turbidity and its driving factors in a navigable and turbid river using in situ measurement data, satellite data, socioeconomic data, a power index function model, and correlation analysis. The results show that the proposed model is feasible for quantitative turbidity monitoring of the Xitiaoxi River. Its upstream turbidity is lower than its downstream turbidity, with seasonal averages for spring, summer, autumn, and winter of 93.9, 111.3, 113.5, and 120.9 NTU, respectively. Furthermore, the turbidity in the middle and lower reaches of the Xitiaoxi River continuously increased before 2005 and began to decline after 2005 due to the mining moratorium policy. This trend is especially noticeable at monitoring points along the main stream of the Xitiaoxi River, such as downstream of the Xitiaoxi River (S1), Gangkou station (S2), middle reaches of the Xitiaoxi River (S4), Hengtangcun station (S6), upper stream of the Xitiaoxi River (S7), and Huxi River (S8). Mining and shipping have significantly contributed to the turbidity of the target river. This framework offers a practical approach for assessing the environmental impacts of both natural and anthropogenic factors, thereby providing valuable insights for river management practices.

24 pages, 2073 KiB

Open Access Review

Overview of Wind and Photovoltaic Data Stream Classification and Data Drift Issues

by Xinchun Zhu, Yang Wu, Xu Zhao, Yunchen Yang, Shuangquan Liu, Luyi Shi and Yelong Wu

Energies 2024, 17(17), 4371; https://doi.org/10.3390/en17174371 - 1 Sep 2024

Viewed by 1100

Abstract

The development in the fields of clean energy, particularly wind and photovoltaic power, generates a large amount of data streams, and how to mine valuable information from these data to improve the efficiency of power generation has become a hot spot of current research. Traditional classification algorithms cannot cope with dynamically changing data streams, so data stream classification techniques are particularly important. The current data stream classification techniques mainly include decision trees, neural networks, Bayesian networks, and other methods, which have been applied to wind power and photovoltaic power data processing in existing research. However, the data drift problem is gradually highlighted due to the dynamic change in data, which significantly impacts the performance of classification algorithms. This paper reviews the latest research on data stream classification technology in wind power and photovoltaic applications. It provides a detailed introduction to the data drift problem in machine learning, which significantly affects algorithm performance. The discussion covers covariate drift, prior probability drift, and concept drift, analyzing their potential impact on the practical deployment of data stream classification methods in wind and photovoltaic power sectors. Finally, by analyzing examples for addressing data drift in energy-system data stream classification, the article highlights the future prospects of data drift research in this field and suggests areas for improvement. Combined with the systematic knowledge of data stream classification techniques and data drift handling presented, it offers valuable insights for future research.
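
The three drift types named in this abstract are often formalized in the dataset-shift literature as below. This is a common convention with assumed notation (features x, label y, distributions at times t and t+1); the review itself may define them differently.

```latex
\begin{aligned}
\text{covariate drift:} \quad & P_t(x) \neq P_{t+1}(x), \quad P_t(y \mid x) = P_{t+1}(y \mid x) \\
\text{prior probability drift:} \quad & P_t(y) \neq P_{t+1}(y), \quad P_t(x \mid y) = P_{t+1}(x \mid y) \\
\text{concept drift:} \quad & P_t(y \mid x) \neq P_{t+1}(y \mid x)
\end{aligned}
```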

(This article belongs to the Special Issue Advances in Renewable Energy Power Forecasting and Integration)

17 pages, 24301 KiB

Open Access Article

Hydrodynamic Model of the Area of the Żelazny Most Mining Waste Storage Facility to Reconstruct the Migration of Saline Groundwater

by Jacek Gurwin, Marek Wcisło, Stanisław Staśko, Sebastian Buczyński, Magdalena Modelska, Tomasz Olichwer and Robert Tarka

Water 2024, 16(17), 2431; https://doi.org/10.3390/w16172431 - 28 Aug 2024

Viewed by 939

Abstract

This paper presents the construction of a numerical three-dimensional model of the area of the Żelazny Most Mining Waste Storage Facility (MWSF). In the study area, the difficult geological conditions associated with glaciotectonics are accompanied by a complex hydrotechnical system of sediment deposition and sedimentary water drainage. In order to effectively reflect the water flow paths, a detailed schematization was carried out, using 700,000 boreholes and more than 300 hydrogeological cross-sections. In addition, numerous drainage sections, streams, and ditches were included to reliably assess the amount of saline water entering the underlying aquifers. This research was supported by magnetic resonance sounding (MRS) studies of the reservoir’s sediments. The MWSF is currently being expanded, so the work primarily focuses on illustrating changes in the hydrodynamic field resulting from the inclusion of the new southern section. Models of similar facilities have been implemented before, but in the current one, the combination of meticulous analysis of the hydro-structural system, the water balance, a significant amount of data, the size of the facility, and the use of an unstructured discretization grid in the calculations is undoubtedly innovative and will be an important contribution to the development of analogous solutions around the world.

(This article belongs to the Special Issue Groundwater Monitoring, Assessment and Modelling)

32 pages, 4130 KiB

Open Access Article

An Adaptive Active Learning Method for Multiclass Imbalanced Data Streams with Concept Drift

by Meng Han, Chunpeng Li, Fanxing Meng, Feifei He and Ruihua Zhang

Appl. Sci. 2024, 14(16), 7176; https://doi.org/10.3390/app14167176 - 15 Aug 2024

Viewed by 1137

Abstract

Learning from multiclass imbalanced data streams with concept drift and variable class imbalance ratios under a limited label budget presents new challenges in the field of data mining. To address these challenges, this paper proposes an adaptive active learning method for multiclass imbalanced data streams with concept drift (AdaAL-MID). Firstly, a dynamic label budget strategy under concept drift scenarios is introduced, which allocates label budgets reasonably at different stages of the data stream to effectively handle concept drift. Secondly, an uncertainty-based label request strategy using a dual-margin dynamic threshold matrix is designed to enhance learning opportunities for minority class instances and those that are challenging to classify, and combined with a random strategy, it can estimate the current class imbalance distribution by accessing only a limited number of instance labels. Finally, an instance-adaptive sampling strategy is proposed, which comprehensively considers the imbalance ratio and classification difficulty of instances, and combined with a weighted ensemble strategy, improves the classification performance of the ensemble classifier in imbalanced data streams. Extensive experiments and analyses demonstrate that AdaAL-MID can handle various complex concept drifts and adapt to changes in class imbalance ratios, and it outperforms several state-of-the-art active learning algorithms.

29 pages, 3714 KiB

Open Access Article

Variance Feedback Drift Detection Method for Evolving Data Streams Mining

by Meng Han, Fanxing Meng and Chunpeng Li

Appl. Sci. 2024, 14(16), 7157; https://doi.org/10.3390/app14167157 - 15 Aug 2024

Viewed by 890

Abstract

Learning from changing data streams is one of the important tasks of data mining. The phenomenon of the underlying distribution of data streams changing over time is called concept drift. In classification decision-making, the occurrence of concept drift will greatly affect the classification efficiency of the original classifier, that is, the old decision-making model is not suitable for the new data environment. Therefore, dealing with concept drift from changing data streams is crucial to guarantee classifier performance. Currently, most concept drift detection methods apply the same detection strategy to different data streams, with little attention to the uniqueness of each data stream. This limits the adaptability of drift detectors to different environments. In our research, we designed a unique solution to address this issue. First, we proposed a variance estimation strategy and a variance feedback strategy to characterize the data stream through its variance. Based on this variance, we developed personalized drift detection schemes for different data streams, thereby enhancing the adaptability of drift detection in various environments. We conducted experiments on data streams with various types of drifts. The experimental results show that our algorithm achieves the best average ranking for accuracy on the synthetic dataset, with an overall ranking 1.12 to 1.5 higher than the next-best algorithm. In comparison with algorithms using the same tests, our method improves the ranking by 3 to 3.5 for the Hoeffding test and by 1.12 to 2.25 for the McDiarmid test. In addition, it achieves a good balance between detection delay and false positive rates. Finally, our algorithm ranks higher than existing drift detection methods across the four key metrics of accuracy, CPU time, false positives, and detection delay, meeting our expectations.
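
As background for the Hoeffding test mentioned in this abstract, the following is a minimal sketch of a generic Hoeffding-bound drift check. It is not the article's variance feedback method. The function names, the fixed `baseline_rate`, and the 0/1 error encoding are assumptions made for illustration.

```python
import math


def hoeffding_bound(n: int, delta: float) -> float:
    """One-sided Hoeffding deviation bound for the mean of n values in [0, 1].

    With probability at least 1 - delta, the sample mean exceeds its
    expectation by no more than the returned epsilon.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def drift_suspected(errors, baseline_rate: float, delta: float = 0.05) -> bool:
    """Flag drift when the recent error rate exceeds a baseline by more than the bound.

    `errors` is a window of 0/1 misclassification indicators (hypothetical input);
    `baseline_rate` is treated as a known reference error rate (an assumption).
    """
    n = len(errors)
    recent_rate = sum(errors) / n
    return (recent_rate - baseline_rate) > hoeffding_bound(n, delta)
```

Because the bound shrinks as the window grows, a longer window can flag a smaller rise in error rate, at the cost of a longer detection delay.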

(This article belongs to the Section Computing and Artificial Intelligence)

24 pages, 16296 KiB

Open Access Article

Improving Mineral Classification Using Multimodal Hyperspectral Point Cloud Data and Multi-Stream Neural Network

by Aldino Rizaldy, Ahmed Jamal Afifi, Pedram Ghamisi and Richard Gloaguen

Remote Sens. 2024, 16(13), 2336; https://doi.org/10.3390/rs16132336 - 26 Jun 2024

Viewed by 2410

Abstract

In this paper, we leverage multimodal data to classify minerals using a multi-stream neural network. In a previous study on the Tinto dataset, which consisted of a 3D hyperspectral point cloud from the open-pit mine Corta Atalaya in Spain, we successfully identified mineral classes by employing various deep learning models. However, this prior work solely relied on hyperspectral data as input for the deep learning models. In this study, we aim to enhance accuracy by incorporating multimodal data, which includes hyperspectral images, RGB images, and a 3D point cloud. To achieve this, we have adopted a graph-based neural network, known for its efficiency in aggregating local information, based on our past observations where it consistently performed well across different hyperspectral sensors. Subsequently, we constructed a multi-stream neural network tailored to handle multimodality. Additionally, we employed a channel attention module on the hyperspectral stream to fully exploit the spectral information within the hyperspectral data. Through the integration of multimodal data and a multi-stream neural network, we achieved a notable improvement in mineral classification accuracy: 19.2%, 4.4%, and 5.6% on the LWIR, SWIR, and VNIR datasets, respectively.

(This article belongs to the Special Issue Remote Sensing for Geology and Mapping)

15 pages, 3882 KiB

Open Access Article

A Dual-Stream Cross AGFormer-GPT Network for Traffic Flow Prediction Based on Large-Scale Road Sensor Data

by Yu Sun, Yajing Shi, Kaining Jia, Zhiyuan Zhang and Li Qin

Sensors 2024, 24(12), 3905; https://doi.org/10.3390/s24123905 - 17 Jun 2024

Viewed by 1043

Abstract

Traffic flow prediction can provide important reference data for managers to maintain traffic order, and can also support optimal route selection based on personal travel plans. With the development of sensors and data collection technology, large-scale historical road network data can be used effectively, but their high non-linearity makes establishing effective prediction models a meaningful problem. In this regard, this paper proposes a dual-stream cross AGFormer-GPT network with prompt engineering for traffic flow prediction, which integrates traffic occupancy and speed as two prompts into traffic flow in the form of cross-attention, and uniquely mines spatial correlation and temporal correlation information through the dual-stream cross structure, effectively combining the advantages of the adaptive graph neural network and large language model to improve prediction accuracy. The experimental results on two PeMS road network data sets verify that the model improves traffic prediction accuracy by about 1.2% under different road networks.

(This article belongs to the Special Issue Feature Papers in the 'Sensor Networks' Section 2024)

19 pages, 331 KiB

Open Access Article

An Efficient Probabilistic Algorithm to Detect Periodic Patterns in Spatio-Temporal Datasets

by Claudio Gutiérrez-Soto, Patricio Galdames and Marco A. Palomino

Big Data Cogn. Comput. 2024, 8(6), 59; https://doi.org/10.3390/bdcc8060059 - 3 Jun 2024

Viewed by 1321

Abstract

Deriving insight from data is a challenging task for researchers and practitioners, especially when working on spatio-temporal domains. If pattern searching is involved, the complications introduced by temporal data dimensions create additional obstacles, as traditional data mining techniques are insufficient to address spatio-temporal databases (STDBs). We hereby present a new algorithm, which we refer to as F1/FP and which can be described as a probabilistic version of the Minus-F1 algorithm to look for periodic patterns. To the best of our knowledge, no previous work has compared the most cited algorithms in the literature to look for periodic patterns, namely, Apriori, MS-Apriori, FP-Growth, Max-Subpattern, and PPA. Thus, we have carried out such comparisons and then evaluated our algorithm empirically using two datasets, showcasing its ability to handle different types of periodicity and data distributions. By conducting such a comprehensive comparative analysis, we have demonstrated that our newly proposed algorithm has a smaller complexity than the existing alternatives and speeds up the performance regardless of the size of the dataset. We expect our work to contribute greatly to the mining of astronomical data and the permanently growing online streams derived from social media.

(This article belongs to the Special Issue Big Data and Information Science Technology)

21 pages, 9980 KiB

Open AccessCase Report

The Study of Groundwater in the Zhambyl Region, Southern Kazakhstan, to Improve Sustainability

by Dinara Adenova, Dani Sarsekova, Malis Absametov, Yermek Murtazin, Janay Sagin, Ludmila Trushel and Oxana Miroshnichenko

Sustainability 2024, 16(11), 4597; https://doi.org/10.3390/su16114597 - 29 May 2024

Cited by 6 | Viewed by 1978

Abstract

Water resources are scarce and difficult to manage in Kazakhstan, Central Asia (CA). Anthropogenic activities largely eliminated the Aral Sea. Afghanistan’s large-scale canal construction may eliminate life in the main stream of the Amu Darya River, CA. Kazakhstan’s HYRASIA ONE project, with a EUR 50 billion investment to produce green hydrogen, is set to withdraw water from the Caspian Sea. Kazakhstan, CA, requires sustainable programs that integrate both decision-makers’ and people’s behavior. For this paper, the authors investigated groundwater resources for sustainable use, including for consumption, and the potential for natural “white” hydrogen production from underground geological “factories”. Kazakhstan is rich in natural resources, such as iron-rich rocks, minerals, and uranium, which are necessary for the serpentinization reactions and radiolysis involved in natural hydrogen production from underground water. Investigations of underground geological “factories” require substantial efforts in field data collection. We carried out a chemical analysis of 40 groundwater samples from the 97 wells surveyed in the T. Ryskulov, Zhambyl, Baizak and Zhualy districts of the Zhambyl region in South Kazakhstan in 2021–2022. These samples were compared with water samples collected previously in 2020–2021. The analysis of the groundwater samples revealed various concentrations of different minerals, natural geological rocks, and anthropogenic materials. South Kazakhstan is rich in natural mineral resources. As a result, mining companies extract resources in the Taraz–Zhanatas–Karatau and the Shu–Novotroitsk industrial areas. The highest mineral levels in water samples were found in the territory of the Talas–Assinsky interfluve, where the main industrial mining enterprises are concentrated and the largest groundwater deposits have been explored. Groundwater compositions have direct connections to geological rocks. The geological rocks are confined to sandstones, siltstones, porphyrites, conglomerates, limestones, and metamorphic rocks. In observation wells, a number of components can be found in high concentrations (mg/L): sulfates, 602.0 (MPC 500 mg/L); sodium, 436.5 (MPC 200 mg/L); chlorine, 465.4 (MPC 350 mg/L); lithium, 0.18 (MPC 0.03 mg/L); boron, 0.74 (MPC 0.5 mg/L); cadmium, 0.002 (MPC 0.001 mg/L); strontium, 15.0 (MPC 7.0 mg/L); and TDS, 1970 (MPC 1000 mg/L). The high mineral contents in the water are natural and comprise minerals from geological sources ranging from iron-rich rocks to uranium. Proper groundwater classifications for research investigations are required to separate potable groundwater resources, wells, and areas where underground geological “factories” producing natural “white” hydrogen could potentially be located. Our preliminary investigation results are presented with the aim of creating a large-scale targeted program to improve water sustainability in Kazakhstan, CA.

(This article belongs to the Special Issue Sustainable Water Resources Management under Growing Anthropic Demands and the Effects of Climate Change)

Search Results (152)
