Search Results (150)

Search Parameters:
Keywords = data stream mining

29 pages, 1577 KiB  
Article
DIAFM: An Improved and Novel Approach for Incremental Frequent Itemset Mining
by Mohsin Shaikh, Sabina Akram, Jawad Khan, Shah Khalid and Youngmoon Lee
Mathematics 2024, 12(24), 3930; https://doi.org/10.3390/math12243930 - 13 Dec 2024
Viewed by 373
Abstract
Traditional approaches to data mining are generally designed for small, centralized, and static datasets. However, when a dataset grows at an enormous rate, the algorithms become infeasible in terms of huge consumption of computational and I/O resources. Frequent itemset mining (FIM) is one of the key algorithms in data mining and finds applications in a variety of domains; however, traditional algorithms do face problems in efficiently processing large and dynamic datasets. This research introduces a distributed incremental approximation frequent itemset mining (DIAFM) algorithm that tackles the mentioned challenges using shard-based approximation within the MapReduce framework. DIAFM minimizes the computational overhead of a program by reducing dataset scans, bypassing exact support checks, and incorporating shard-level error thresholds for an appropriate trade-off between efficiency and accuracy. Extensive experiments have demonstrated that DIAFM reduces runtime by 40–60% compared to traditional methods with losses in accuracy within 1–5%, even for datasets over 500,000 transactions. Its incremental nature ensures that new data increments are handled efficiently without needing to reprocess the entire dataset, making it particularly suitable for real-time, large-scale applications such as transaction analysis and IoT data streams. These results demonstrate the scalability, robustness, and practical applicability of DIAFM and establish it as a competitive and efficient solution for mining frequent itemsets in distributed, dynamic environments. Full article
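The shard-based, incremental idea described in this abstract can be illustrated with a minimal sketch. It is not the DIAFM algorithm itself: the class name `ShardedItemsetCounter`, the `max_size` limit, and the rule of keeping an itemset only when its count within a shard reaches `(min_support - error) * len(shard)` are assumptions made for illustration, and the sketch runs in a single process rather than on MapReduce.

```python
from collections import Counter
from itertools import combinations


class ShardedItemsetCounter:
    """Hypothetical illustration of incremental, approximate itemset counting."""

    def __init__(self, min_support, error, max_size=2):
        self.min_support = min_support  # global relative support threshold
        self.error = error              # per-shard slack (assumed, not from the paper)
        self.max_size = max_size        # largest itemset size to enumerate
        self.counts = Counter()         # merged approximate counts
        self.total = 0                  # number of transactions seen so far

    def _mine_shard(self, shard):
        local = Counter()
        for transaction in shard:
            items = sorted(set(transaction))
            for k in range(1, self.max_size + 1):
                for itemset in combinations(items, k):
                    local[itemset] += 1
        # Keep only itemsets that are locally frequent under the relaxed threshold.
        cutoff = (self.min_support - self.error) * len(shard)
        return Counter({s: c for s, c in local.items() if c >= cutoff})

    def add_increment(self, shards):
        """Process only the new shards; earlier data is never rescanned."""
        for shard in shards:
            self.counts.update(self._mine_shard(shard))
            self.total += len(shard)

    def frequent(self):
        """Return itemsets whose merged count passes the relaxed global threshold."""
        cutoff = (self.min_support - self.error) * self.total
        return {s: c for s, c in self.counts.items() if c >= cutoff}
```

Because itemsets pruned inside a shard never reach the merged counts, the reported supports are lower bounds. That is the kind of accuracy-for-speed trade-off the abstract refers to, although the paper's actual error thresholds may be defined differently.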
Figures
Figure 1. Data mining algorithms and evolution.
Figure 2. Steps involved in DIAFM.
Figure 3. DIAFM workflow.
Figure 4. Incremental data (ΔD_t) sharding example.
Figure 5. Local itemset mining workflow.
Figure 6. Incremental itemset mining workflow.
Figure 7. Approximate global itemset mining workflow.
Figure 8. Profile integration workflow.
Figure 9. Profile integration example.
Figure 10. Distributed non-incremental data cleanup.
Figure 11. DIAFM data cleanup.
Figure 12. DIAFM MapReduce scaleup.
Figure 13. DIAFM incremental setup.
Figure 14. Hierarchy of FIM-based algorithms.
Figure 15. Comparative analysis for execution time.
Figure 16. Comparative analysis for memory usage.
17 pages, 2189 KiB  
Article
Refinement and Validation of the SPEcies at Risk Index for Metals (SPEARmetal Index) for Assessing Ecological Impacts of Metal Contamination in the Nakdong River, South Korea
by Dae-sik Hwang, Jongwoo Kim, Jiwoong Chung and Jonghyeon Lee
Water 2024, 16(22), 3308; https://doi.org/10.3390/w16223308 - 18 Nov 2024
Viewed by 394
Abstract
The SPEcies At Risk index for metals (SPEARmetal index) was refined using updated physiological sensitivity data and validated to assess the ecological impact of metal contamination on benthic macroinvertebrate communities in the upper Nakdong River, near a Zn smelter in Korea. Biosurvey and chemical monitoring data were collected at 18 sites surrounding the smelter and nearby mines. Acute ecotoxicity tests on 20 indigenous species from the Korean peninsula were conducted and used to update taxon-specific metal sensitivity data. The refined SPEARmetal index, based on this updated sensitivity, was significantly lower than previous versions, with most values below the severe impact threshold (0.5) in the main stream. The correlation between hazard quotients in water and the SPEAR index improved, with the correlation coefficient increasing from 0.63 to 0.70. Despite consistently high benthic macroinvertebrate indices (BMIs) across the study area, generic ecological indices, such as total richness, EPT (Ephemeroptera, Plecoptera, and Trichoptera taxa richness), and Shannon’s diversity index, showed correlations with metal contamination levels. Principal component analysis identified the SPEARmetal index as the primary indicator associated with metal contamination in both water and sediment. These findings highlight the improved performance of the refined SPEARmetal index as a more sensitive and specific tool for assessing the ecological status of metal-impacted aquatic ecosystems compared to traditional indices. Full article
(This article belongs to the Section Water Quality and Contamination)
Figures
Figure 1. Location map of sampling stations.
Figure 2. Histogram of the metal sensitivity values for the SPEAR index. The black bars and solid line show the frequency of sensitivity values from the updated ecotoxicity dataset; the red bars and solid line show the frequency of sensitivity values in the existing SPEAR index for metals [15].
Figure 3. Comparison of metal sensitivity values using the existing [15] and updated (this study) SPEAR indices in the upper Nakdong River. (a) Number of macroinvertebrate families in the upper Nakdong River and those with available ecotoxicity data. (b) Scatter plots comparing metal sensitivity values between the existing and updated indices. Black dashed lines represent the median metal sensitivity of the existing index. Red dotted lines indicate the main stream, and blue dotted lines indicate the tributaries.
Figure 4. Site-specific relationships between contaminant concentrations, expressed as the sum of Hazard Quotients (HQs) or the mean Probable Effect Level quotient (mPELq), and ecological indices for macroinvertebrate community structure (TR, EPT, H′, BMI, and SPEARmetal) in the upper Nakdong River. Scatter plots show: (a) SPEARmetal vs. ΣTU, (b) SPEARmetal vs. mPELq, (c) other ecological indices vs. ΣTU, (d) other ecological indices vs. mPELq. TR: taxon richness, H′: Shannon’s diversity index, EPT: Ephemeroptera, Plecoptera, and Trichoptera taxa richness, BMI: Benthic Macroinvertebrate Index. Linear relationships are shown between the logarithm of ΣTU and the SPEAR index calculated using the existing (SPEAR-2012) and newly updated (SPEAR-2024) taxon-specific metal sensitivity. The SPEAR-2012 values were calculated for the whole dataset, but SPEAR-2024 values were calculated for the whole dataset (SPEAR-2024t) and the main stream (SPEAR-2024s), separately.
Figure 5. Site-specific relationship between individual metal concentrations and the SPEAR index for metals (SPEARmetal) in the main stream of the upper Nakdong River. (a) Hazard Quotients (HQs) of Zn and Cd in surface water, and (b) Probable Effect Level quotients (PELqs) of Zn, Cd, and As in sediment, are plotted against SPEARmetal values.
Figure 6. Ordination plot for the principal component analysis of the ecological indices and environmental factors projected into the ordination model for the total samples from the main stream and tributaries. TR: taxon richness, H′: Shannon’s diversity index, EPT: Ephemeroptera, Plecoptera, and Trichoptera taxa richness, BMI: Benthic Macroinvertebrate Index, SPEAR-2012: SPEAR index for the existing metal sensitivity [15], SPEAR-2024s: SPEAR indices calculated in this study, sumHQ: the sum of the hazard quotients, mPELq: mean Probable Effect Level quotient, and Temp: temperature.
Figure A1. Gradient of metal concentration (Cu, Zn, and Cd) and the SPEAR index for metals (SPEARmetal): (a) main stream sites and (b) tributary sites in the upper Nakdong River.
Figure A2. Relationship of the sum of toxic units and SPEAR index for metals and predator ratio [16].
19 pages, 1160 KiB  
Article
Enhancing Explainable Recommendations: Integrating Reason Generation and Rating Prediction through Multi-Task Learning
by Xingyu Zhu, Xiaona Xia, Yuheng Wu and Wenxu Zhao
Appl. Sci. 2024, 14(18), 8303; https://doi.org/10.3390/app14188303 - 14 Sep 2024
Viewed by 1175
Abstract
In recent years, recommender systems—which provide personalized recommendations by analyzing users’ historical behavior to infer their preferences—have become essential tools across various domains, including e-commerce, streaming media, and social platforms. Recommender systems play a crucial role in enhancing user experience by mining vast amounts of data to identify what is most relevant to users. Among these, deep learning-based recommender systems have demonstrated exceptional recommendation performance. However, these “black-box” systems lack reasonable explanations for their recommendation results, which reduces their impact and credibility. To address this situation, an effective strategy is to provide a personalized textual explanation along with the recommendation. This approach has received increasing attention from researchers because it can enhance users’ trust in recommender systems through intuitive explanations. In this context, our paper introduces a novel explainable recommendation model named GCLTE. This model integrates Graph Contrastive Learning with transformers within an Encoder–Decoder framework to perform rating prediction and reason generation simultaneously. In addition, we cleverly combine the neural network layer with the transformer using a straightforward information enhancement operation. Finally, our extensive experiments on three real-world datasets demonstrate the effectiveness of GCLTE in both recommendation and explanation. The experimental results show that our model outperforms the top existing models. Full article
Figures
Figure 1. GCLTE: integrating Graph Contrastive Learning and transformers for rating prediction and reason generation. In the figure, “bos” represents the marker indicating the beginning of the input sequence.
Figure 2. Representation of final layer embedding.
Figure 3. Adding noise perturbation to node vectors.
Figure 4. Effects of hyperparameter settings on performance. (a) Effects of hyperparameter settings on rating prediction performance. (b) Effects of hyperparameter settings on reason generation performance.
19 pages, 7689 KiB  
Article
Development of High-Silica Adakitic Intrusions in the Northern Appalachians of New Brunswick (Canada), and Their Correlation with Slab Break-Off: Insights into the Formation of Fertile Cu-Au-Mo Porphyry Systems
by Fazilat Yousefi, David R. Lentz, James A. Walker and Kathleen G. Thorne
Geosciences 2024, 14(9), 241; https://doi.org/10.3390/geosciences14090241 - 7 Sep 2024
Cited by 1 | Viewed by 892
Abstract
High-silica adakites exhibit specific compositions, as follows: SiO2 ≥ 56 wt.%, Al2O3 ≥ 15 wt.%, Y ≤ 18 ppm, Yb ≤ 1.9 ppm, K2O/Na2O ≥ 1, MgO < 3 wt.%, high Sr/Y (≥10), and La/Yb (>10). Devonian I-type adakitic granitoids in the northern Appalachians of New Brunswick (NB, Canada) share geochemical signatures of adakites elsewhere, i.e., SiO2 ≥ 66.46 wt.%, Al2O3 > 15.47 wt.%, Y ≤ 22 ppm, Yb ≤ 2 ppm, K2O/Na2O > 1, MgO < 3 wt.%, Sr/Y ≥ 33 to 50, and La/Yb > 10. Remarkably, adakitic intrusions in NB, including the Blue Mountain Granodiorite Suite, Nicholas Denys, Sugar Loaf, Squaw Cap, North Dungarvan River, Magaguadavic Granite, Hampstead Granite, Tower Hill, Watson Brook Granodiorite, Rivière-Verte Porphyry, Eagle Lake Granite, Evandale Granodiorite, North Pole Stream Suite, and the McKenzie Gulch porphyry dykes, all have associated Cu mineralization, similar to the Middle Devonian Cu porphyry intrusions in Mines Gaspé, Québec. Trace element data support the connection between adakite formation and slab break-off, a mechanism influencing fertility and generation of porphyry Cu systems. These adakitic rocks in NB are oxidized, and are relatively enriched in large ion lithophile elements, like Cs, Rb, Ba, and Pb, and depleted in some high field strength elements, like Y, Nb, Ta, P, and Ti; they also have Sr/Y ≥ 33 to 50, Nb/Y > 0.4, Ta/Yb > 0.3, La/Yb > 10, Sm/Yb > 2.5, Gd/Yb > 2.0, Nb + Y < 60 ppm, and Ta + Yb < 6 ppm. These geochemical indicators point to failure of a subducting oceanic slab (slab rollback to slab break-off) in the terminal stages of subduction, as the generator of post-collisional granitoid magmatism. The break-off and separation of a dense subducted oceanic plate segment leads to upwelling asthenosphere, heat advection, and selective partial melting of the descending oceanic slab (adakite) and (or) suprasubduction zone lithospheric mantle. The resulting silica-rich adakitic magmas ascend through thickened mantle lithosphere, with minimal effect from the asthenosphere. The critical roles of transpression and transtension are highlighted in facilitating the ascent and emplacement of these fertile adakitic magmas in postsubduction zone settings. Full article
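The compositional criteria listed in the first sentence of this abstract lend themselves to a simple screening function. The sketch below copies those thresholds verbatim from the abstract; the `Sample` type, its field names, and the `failed_criteria` helper are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical whole-rock analysis; oxides in wt.%, trace elements in ppm."""
    sio2: float
    al2o3: float
    mgo: float
    k2o: float
    na2o: float
    y: float
    yb: float
    sr: float
    la: float


def failed_criteria(s):
    """Return the high-silica adakite criteria (as listed in the abstract) that s fails."""
    checks = {
        "SiO2 >= 56 wt.%": s.sio2 >= 56,
        "Al2O3 >= 15 wt.%": s.al2o3 >= 15,
        "Y <= 18 ppm": s.y <= 18,
        "Yb <= 1.9 ppm": s.yb <= 1.9,
        "K2O/Na2O >= 1": s.k2o / s.na2o >= 1,
        "MgO < 3 wt.%": s.mgo < 3,
        "Sr/Y >= 10": s.sr / s.y >= 10,
        "La/Yb > 10": s.la / s.yb > 10,
    }
    return [name for name, ok in checks.items() if not ok]
```

An empty list means the sample passes every listed threshold. Returning the failed criteria by name, rather than a single boolean, makes it easier to see why a borderline sample is excluded.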
(This article belongs to the Special Issue Zircon U-Pb Geochronology Applied to Tectonics and Ore Deposits)
Figures
Figure 1. (a) Major tectonic zones of the Canadian Appalachians; (b) Tectonic zones and cover sequences of New Brunswick (modified from [27]).
Figure 2. Regional map of the New Brunswick Appalachians, showing the location of Devonian mafic-to-felsic granitoids and major faults (modified from [28]).
Figure 3. Geochemical discrimination diagrams for adakitic samples investigated: (a) SiO2 vs. Na2O + K2O discrimination diagram. Field boundaries from Cox et al. [32]; (b) SiO2 vs. K2O discrimination diagram with field boundaries from [33]; (c) Al2O3/(CaO + K2O + Na2O) (A/CNK) vs. Al2O3/(Na2O + K2O) (A/NK) diagram modified from [34]. The line at A/CNK = 1.1 is a key parameter for discriminating S- from I-type granites [35]; (d) FeOt/(FeOt + MgO) vs. SiO2 discrimination diagram with field boundaries from [36].
Figure 4. (a) (La/Yb)N vs. (Yb)N discrimination diagram with field boundaries from [37]; (b) Sr/Y vs. Y discrimination diagram with field boundaries from [37]; (c) SiO2 vs. MgO discrimination diagram for high- and low-silica adakite; (d) primitive mantle-normalized extended element spider diagram. Symbols are the same as in Figure 3. Normalization factors are from [38]. TTG = tonalite–trondhjemite–granodiorite, ADR = andesite–dacite–rhyolite.
Figure 5. Harker diagrams of Devonian adakitic rocks of NB. SiO2 vs. (a) TiO2, (b) Al2O3, (c) Ni, and (d) Co. The same symbols as in Figure 3 are used. The arrows indicate a general fractionation trend towards high silica.
Figure 6. Geochemical discrimination diagrams. (a) FeOt/MgO vs. Zr + Nb + Ce + Y (ppm) and (b) Zr + Nb + Ce + Y (ppm) vs. (Na2O + K2O)/CaO. Field boundaries are from [40]. A-type: A-type granite, FG: fractionated granite rocks, OTG: unfractionated granite/other type of granite.
Figure 7. Tectonomagmatic discrimination diagrams for differentiating among slab failure, arc, and A-type granites applied to the New Brunswick granites investigated. (a) Nb + Y vs. Ta/Yb; (b) Ta + Yb vs. Ta/Yb; (c) Nb + Y vs. La/Yb; (d) Ta + Yb vs. Sm/Yb; (e) Nb + Y vs. Gd/Yb; (f) Ta + Yb vs. Gd/Yb; (g) Nb + Y vs. Nb/Y; (h) Ta + Yb vs. Nb/Y. All field boundaries are from [48,50,51].
Figure 8. Continuation of tectonomagmatic discrimination diagrams. (a) Gd/Yb vs. La/Yb; (b) Sm/Yb vs. La/Sm; (c) Ta + Yb vs. Rb; (d) Nb + Y vs. Rb; (e) Y vs. Nb; (f) Yb vs. Ta. Symbols as in Figure 7.
Figure 9. Tectonic discrimination diagrams for adakitic rocks investigated in this study. (a) Nb/Yb vs. Th/Yb, and (b) TiO2/Yb vs. Nb/Yb. Field boundaries are from [53]. MORB: mid-ocean ridge basalt, OIB: ocean island basalt, Th: tholeiite, Alk: alkaline, EMORB: enriched mid-ocean ridge basalt, NMORB: normal mid-ocean ridge basalt. Symbols as in Figure 7.
Figure 10. Discrimination diagrams for the determination of magmatic source rocks for adakites in New Brunswick. (a) MgO (wt.%) vs. SiO2 (wt.%), and (b) Mg# vs. SiO2 (wt.%) diagrams for determining the effective factors in creating these adakitic magmas. Symbols as in Figure 7. Field boundaries are from [54].
Figure 11. Tectonic discrimination diagram for New Brunswick adakites. Field boundaries are from [55]. Hb: hornblende, An: anorthite, Ab: albite, En: enstatite, Fa: fayalite, Fo: forsterite, Bt: biotite, Fs: feldspar, Sp: sphene (titanite), Hd: hedenbergite, Ha: haapalaite, and Di: diopside. Symbols as in Figure 7.
Figure 12. Schematic model showing the Silurian–Carboniferous tectonic evolution of the northern Appalachian orogen, and the generation of slab break-off-generated magmas; (a) late Silurian–Early Devonian, and (b) Middle Devonian–Early Carboniferous. Modified from [66].
18 pages, 10594 KiB  
Article
A Framework for Characterizing Spatio-Temporal Variation of Turbidity and Drivers in the Navigable and Turbid River: A Case Study of Xitiaoxi River
by Min Zhang, Renhua Yan, Junfeng Gao, Suding Yan and Jialong Yan
Water 2024, 16(17), 2503; https://doi.org/10.3390/w16172503 - 3 Sep 2024
Viewed by 863
Abstract
Turbidity, as a key indicator of water quality linked to underwater light attenuation, is crucial for evaluating water quality. Control in high-turbidity water environments plays a critical role in navigable rivers. For this purpose, our study proposed a framework for analyzing the spatio-temporal variation of turbidity and its driving factors in a navigable and turbid river using in situ measurement data, satellite data, socioeconomic data, a power index function model, and correlation analysis. The results show that the proposed model is feasible for quantitative turbidity monitoring of the Xitiaoxi River. Its upstream turbidity is lower than downstream, with seasonal averages for spring, summer, autumn, and winter of 93.9, 111.3, 113.5, and 120.9 NTU, respectively. Furthermore, the turbidity in the middle and lower reaches of the Xitiaoxi River continuously increased before 2005 and began to decline after 2005 due to the policy of mining moratorium. This trend is especially noticeable at monitoring points along the main stream of the Xitiaoxi River, such as downstream of the Xitiaoxi River (S1), Gangkou station (S2), middle reaches of the Xitiaoxi River (S4), Hengtangcun station (S6), upper stream of the Xitiaoxi River (S7), and Huxi River (S8). Mining and shipping have significantly contributed to the turbidity of the target river. This framework offers a practical approach for assessing the environmental impacts of both natural and anthropogenic factors, thereby providing valuable insights for river management practices. Full article
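The abstract mentions a power index function model for retrieving turbidity from satellite data but gives no coefficients. As a hedged illustration only, the sketch below assumes the common form y = a * x^b, where x stands for a band reflectance (or band combination) and y for turbidity in NTU, and fits it by least squares on log-transformed values. The function names and the example values in the usage note are hypothetical, not taken from the paper.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by ordinary least squares on log-transformed data.

    All x and y values must be strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                # slope in log-log space is the exponent
    a = math.exp(my - b * mx)    # intercept in log-log space gives the scale
    return a, b


def predict(a, b, x):
    """Evaluate the fitted power function at x."""
    return a * x ** b
```

For example, `fit_power_law([0.05, 0.1, 0.2], [40.0, 110.0, 300.0])` returns a scale and exponent that can then be passed to `predict` for new reflectance values. Fitting in log space weights relative rather than absolute errors, which may or may not match the calibration used in the study.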
Figures
Figure 1. Location of the study area and sampling sites with photos of shipping and mining.
Figure 2. Framework for quantifying the spatio-temporal variation in turbidity and its drivers in the navigable and turbid river.
Figure 3. Field measurement results of turbidity from September 2020 to July 2021.
Figure 4. Models based on B2, B3 + B4, B4.
Figure 5. Comparison of turbidity between in situ measurement data and model-inversed data.
Figure 6. Average annual turbidity distribution from 1984 to 2022.
Figure 7. Seasonal distribution of turbidity from 1984 to 2022.
Figure 8. Turbidity distribution interval proportion.
Figure 9. Annual variation in turbidity in Xitiaoxi River over the past 40 years.
Figure 10. Correlation between turbidity and sediment discharge at Gangkou station (S2).
Figure 11. Variation in NDVI from 1984 to 2022.
Figure 12. Correlation between turbidity and influential factors (POP, NOSV, and TOVMI represent resident population, number of sailing vessels, and total output value of the mining industry, respectively).
Figure 13. Trend of the number of sailing vessels (left) and total output value of the mining industry (right) in Xitiaoxi River.
24 pages, 2073 KiB  
Review
Overview of Wind and Photovoltaic Data Stream Classification and Data Drift Issues
by Xinchun Zhu, Yang Wu, Xu Zhao, Yunchen Yang, Shuangquan Liu, Luyi Shi and Yelong Wu
Energies 2024, 17(17), 4371; https://doi.org/10.3390/en17174371 - 1 Sep 2024
Viewed by 857
Abstract
The development in the fields of clean energy, particularly wind and photovoltaic power, generates a large amount of data streams, and how to mine valuable information from these data to improve the efficiency of power generation has become a hot spot of current research. Traditional classification algorithms cannot cope with dynamically changing data streams, so data stream classification techniques are particularly important. The current data stream classification techniques mainly include decision trees, neural networks, Bayesian networks, and other methods, which have been applied to wind power and photovoltaic power data processing in existing research. However, the data drift problem is gradually highlighted due to the dynamic change in data, which significantly impacts the performance of classification algorithms. This paper reviews the latest research on data stream classification technology in wind power and photovoltaic applications. It provides a detailed introduction to the data drift problem in machine learning, which significantly affects algorithm performance. The discussion covers covariate drift, prior probability drift, and concept drift, analyzing their potential impact on the practical deployment of data stream classification methods in wind and photovoltaic power sectors. Finally, by analyzing examples for addressing data drift in energy-system data stream classification, the article highlights the future prospects of data drift research in this field and suggests areas for improvement. Combined with the systematic knowledge of data stream classification techniques and data drift handling presented, it offers valuable insights for future research. Full article
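The three drift types named in this abstract are commonly distinguished by which distribution changes: covariate drift affects P(x), prior probability drift affects P(y), and concept drift affects P(y | x). The sketch below is a minimal illustration under that reading, which may differ in detail from the review's own definitions. It compares two windows of (feature, label) pairs with a discrete feature; the function names, the total variation distance, and the tolerance `tol` are assumptions chosen for simplicity.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Empirical distribution of a list of hashable values."""
    counts = Counter(values)
    n = len(values)
    return {k: c / n for k, c in counts.items()}


def tv_distance(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def conditional(window):
    """Empirical P(y | x) for each feature value x in the window."""
    by_x = defaultdict(list)
    for x, y in window:
        by_x[x].append(y)
    return {x: distribution(ys) for x, ys in by_x.items()}


def drift_report(old, new, tol=0.1):
    """Flag which of P(x), P(y), P(y | x) moved by more than tol between windows."""
    px_old = distribution([x for x, _ in old])
    px_new = distribution([x for x, _ in new])
    py_old = distribution([y for _, y in old])
    py_new = distribution([y for _, y in new])
    c_old, c_new = conditional(old), conditional(new)
    shared = set(c_old) & set(c_new)  # compare P(y | x) only where x occurs in both
    concept = max((tv_distance(c_old[x], c_new[x]) for x in shared), default=0.0)
    return {
        "covariate": tv_distance(px_old, px_new) > tol,
        "prior": tv_distance(py_old, py_new) > tol,
        "concept": concept > tol,
    }
```

Note that the flags are not mutually exclusive: when P(y | x) stays fixed, a shift in P(x) usually shifts P(y) as well, so covariate and prior drift are often reported together.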
(This article belongs to the Special Issue Advances in Renewable Energy Power Forecasting and Integration)
Figures
Figure 1. Installed solar and wind energy capacity.
Figure 2. An example of covariate drift.
Figure 3. An example of prior probability drift.
Figure 4. (a) An example of feature drift; (b) an example of instance drift.
Figure 5. (a) An example of abrupt drift; (b) an example of gradual drift.
17 pages, 24301 KiB  
Article
Hydrodynamic Model of the Area of the Żelazny Most Mining Waste Storage Facility to Reconstruct the Migration of Saline Groundwater
by Jacek Gurwin, Marek Wcisło, Stanisław Staśko, Sebastian Buczyński, Magdalena Modelska, Tomasz Olichwer and Robert Tarka
Water 2024, 16(17), 2431; https://doi.org/10.3390/w16172431 - 28 Aug 2024
Viewed by 787
Abstract
This paper presents the construction of a numerical three-dimensional model of the area of the Żelazny Most Mining Waste Storage Facility (MWSF). In the study area, the difficult geological conditions associated with glaciotectonics are accompanied by a complex hydrotechnical system of sediment deposition and sedimentary water drainage. In order to effectively reflect the water flow paths, a detailed schematization was carried out, using 700,000 boreholes and more than 300 hydrogeological cross-sections. In addition, numerous drainage sections, streams, and ditches were included to reliably assess the amount of saline water entering the underlying aquifers. This research was supported by magnetic resonance sounding (MRS) studies of the reservoir’s sediments. The MWSF is currently being expanded, so the work primarily focuses on illustrating changes in the hydrodynamic field resulting from the inclusion of the new southern section. Models of similar facilities have been implemented before, but in the current one, the combination of meticulous analysis of the hydro-structural system, the water balance, a significant amount of data, the size of the facility, and the use of an unstructured discretization grid in the calculations is undoubtedly innovative and will be an important contribution to the development of analogous solutions around the world. Full article
(This article belongs to the Special Issue Groundwater Monitoring, Assessment and Modelling)
Figures
Figure 1. Location map of Żelazny Most Mining Waste Storage Facility (a), with satellite image (b).
Figure 2. The variation in the hydraulic conductivity of sediments within the Żelazny Most landfill, specifically in the cross-section transitioning from the western dam to the eastern dam [30] (with permission from KGHM company).
Figure 3. Hydrogeological schematization on an exemplary XIXaS cross-section from the hydrogeological documentation [38] (with permission from KGHM company), along with schematic lines of the available geological cross-sections against the range of the numerical model.
Figure 4. Thickness maps of the layers of the hydrogeological model.
Figure 5. Spatial distribution of the hydraulic conductivity of the 3rd (a) and 5th (b) layers of the hydrogeological model.
Figure 6. Maps of the bottom of layer 2 (a) and layer 3 (b) in the model grid.
Figure 7. Discretization grid with introduced boundary conditions.
Figure 8. Results of MRS studies.
Figure 9. Computed vs. observed head values.
Figure 10. Head contour map according to calibration state for 2019.
Figure 11. Head contour map from model forecast simulation for 2026.
32 pages, 4130 KiB  
Article
An Adaptive Active Learning Method for Multiclass Imbalanced Data Streams with Concept Drift
by Meng Han, Chunpeng Li, Fanxing Meng, Feifei He and Ruihua Zhang
Appl. Sci. 2024, 14(16), 7176; https://doi.org/10.3390/app14167176 - 15 Aug 2024
Viewed by 930
Abstract
Learning from multiclass imbalanced data streams with concept drift and variable class imbalance ratios under a limited label budget presents new challenges in the field of data mining. To address these challenges, this paper proposes an adaptive active learning method for multiclass imbalanced data streams with concept drift (AdaAL-MID). Firstly, a dynamic label budget strategy under concept drift scenarios is introduced, which allocates label budgets reasonably at different stages of the data stream to effectively handle concept drift. Secondly, an uncertainty-based label request strategy using a dual-margin dynamic threshold matrix is designed to enhance learning opportunities for minority class instances and those that are challenging to classify, and combined with a random strategy, it can estimate the current class imbalance distribution by accessing only a limited number of instance labels. Finally, an instance-adaptive sampling strategy is proposed, which comprehensively considers the imbalance ratio and classification difficulty of instances, and combined with a weighted ensemble strategy, improves the classification performance of the ensemble classifier in imbalanced data streams. Extensive experiments and analyses demonstrate that AdaAL-MID can handle various complex concept drifts and adapt to changes in class imbalance ratios, and it outperforms several state-of-the-art active learning algorithms. Full article
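The label request idea in the abstract above can be illustrated with a much simpler stand-in. The sketch below is not AdaAL-MID: it assumes a single fixed margin threshold in place of the paper's dual-margin dynamic threshold matrix, and the names `margin`, `should_request_label`, `budget`, and `random_rate` are hypothetical. It only shows how a margin-based uncertainty rule, a random strategy, and a label budget might be combined.

```python
import random


def margin(probs):
    """Difference between the two largest predicted class probabilities."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def should_request_label(probs, threshold, spent, seen, budget, rng, random_rate=0.05):
    """Decide whether to ask for the true label of one incoming instance.

    Assumptions (not from the paper): one fixed `threshold`, a budget expressed
    as the allowed fraction of labeled instances, and a small random query rate.
    """
    # Stop querying once the fraction of labeled instances reaches the budget.
    if seen > 0 and spent / seen >= budget:
        return False
    # Random strategy: occasionally query regardless of uncertainty, so the
    # class distribution can be estimated from an unbiased sample.
    if rng.random() < random_rate:
        return True
    # Uncertainty strategy: query when the top two classes are close.
    return margin(probs) < threshold


rng = random.Random(0)
uncertain = should_request_label([0.40, 0.35, 0.25], 0.1, 0, 10, 0.2, rng, random_rate=0.0)
confident = should_request_label([0.90, 0.05, 0.05], 0.1, 0, 10, 0.2, rng, random_rate=0.0)
```

With the random strategy switched off, the uncertain instance is queried and the confident one is not; a per-class or time-varying threshold would be needed to approximate the adaptive behaviour the abstract describes.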
Figure 1. Multiclass imbalanced data streams with concept drift. (The gray shapes and lines represent instances and decision boundaries at time t, while the red ones indicate the state at time t + 1 after concept drift. Different shapes represent instances of different classes).
Figure 2. The framework of the AdaAL-MID.
Figure 3. Effect of parameter variation on the AdaAL-MID’s performance: (a) changes in N on Fixed_MIX_SIG; (b) changes in sizeW on Fixed_MIX_SIG; (c) changes in θ₁ on Fixed_MIX_SIG; (d) changes in θ₂ on Fixed_MIX_SIG; (e) changes in N on Var_MIX_SIG; (f) changes in sizeW on Var_MIX_SIG; (g) changes in θ₁ on Var_MIX_SIG; (h) changes in θ₂ on Var_MIX_SIG.
Figure 4. The recall curves of the comparison algorithms on fixed class proportion data streams.
Figure 5. The recall curves of the comparison algorithms for the variable class proportion data streams.
Figure 6. Recall curves of the comparison algorithms on the real data streams.
Figure 7. Wind rose diagrams on three data streams: (a) data stream with fixed class proportions; (b) data stream with fixed class proportions; (c) real data stream.
Figure 8. Radar charts for 24 data streams.
Figure 9. The number of times AdaAL-MID and the comparison algorithms achieved wins (green), advantageous ties (light green), disadvantageous ties (yellow), and losses (red) on 24 data streams.
Figure 10. BD test results on five metrics: (a) recall; (b) accuracy; (c) G-mean; (d) F1-score; (e) kappa.
Figure 11. Recall curves of the AdaAL-MID ablation study: (a) Fixed_Mix_SIG; (b) Var_Mix_SIG; (c) connect-4.
29 pages, 3714 KiB  
Article
Variance Feedback Drift Detection Method for Evolving Data Streams Mining
by Meng Han, Fanxing Meng and Chunpeng Li
Appl. Sci. 2024, 14(16), 7157; https://doi.org/10.3390/app14167157 - 15 Aug 2024
Viewed by 763
Abstract
Learning from changing data streams is one of the important tasks of data mining. A change over time in the underlying distribution of a data stream is called concept drift. In classification decision-making, concept drift greatly affects the classification efficiency of the original classifier; that is, the old decision-making model is no longer suitable for the new data environment. Therefore, dealing with concept drift in changing data streams is crucial to guarantee classifier performance. Currently, most concept drift detection methods apply the same detection strategy to different data streams, with little attention to the uniqueness of each data stream. This limits the adaptability of drift detectors to different environments. In our research, we designed a solution to address this issue. First, we proposed a variance estimation strategy and a variance feedback strategy to characterize each data stream through its variance. Based on this variance, we developed personalized drift detection schemes for different data streams, thereby enhancing the adaptability of drift detection in various environments. We conducted experiments on data streams with various types of drifts. The experimental results show that our algorithm achieves the best average ranking for accuracy on the synthetic dataset, with an overall ranking 1.12 to 1.5 higher than the next-best algorithm. In comparison with algorithms using the same tests, our method improves the ranking by 3 to 3.5 for the Hoeffding test and by 1.12 to 2.25 for the McDiarmid test. In addition, it achieves a good balance between detection delay and false positive rates. Finally, our algorithm ranks higher than existing drift detection methods across the four key metrics of accuracy, CPU time, false positives, and detection delay, meeting our expectations. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
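For readers unfamiliar with the Hoeffding test mentioned in the abstract, the following is a minimal sketch of a generic Hoeffding-bound drift check. It is not the variance feedback method (VFDDM) itself: it simply compares the mean error of a reference window with that of a recent window, and the function names, the window layout, and the default `delta` are assumptions made for illustration.

```python
import math


def hoeffding_epsilon(n_ref, n_new, delta):
    """Two-sided Hoeffding bound on the difference of two sample means.

    Valid for independent samples whose values lie in [0, 1], such as
    0/1 misclassification indicators.
    """
    return math.sqrt(0.5 * (1.0 / n_ref + 1.0 / n_new) * math.log(2.0 / delta))


def drift_detected(ref_errors, new_errors, delta=0.001):
    """Flag drift when the recent error rate exceeds the reference rate by more than the bound."""
    eps = hoeffding_epsilon(len(ref_errors), len(new_errors), delta)
    mean_ref = sum(ref_errors) / len(ref_errors)
    mean_new = sum(new_errors) / len(new_errors)
    return (mean_new - mean_ref) > eps


# Reference window: 10% errors. Recent window: 60% errors.
reference = [1] * 10 + [0] * 90
recent = [1] * 60 + [0] * 40
print(drift_detected(reference, recent))     # error rate jumped well beyond the bound
print(drift_detected(reference, reference))  # identical windows, no drift flagged
```

A fixed `delta` applied to every stream is exactly the kind of one-size-fits-all setting the abstract argues against; a per-stream scheme would adapt the test to each stream's estimated variance.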
Figure 1. The difference between virtual concept drift and real concept drift (The dashed line represents the decision boundary).
Figure 2. The process of data distribution changes in four types of concept drift.
Figure 3. Variance sample collection process (Including the evaluation of variance sampling conditions and the sampling process).
Figure 4. Workflow of Variance estimation (Including the sampling process and the variance estimation process).
Figure 5. Workflow of Variance feedback (Including weight generation, mean generation, and statistical test selection).
Figure 6. Workflow of VFDDM (Including variance estimation, variance feedback, and drift detection stages).
Figure 7. The accuracy trend of NB using different drift detectors on the synthetic datasets.
Figure 8. The accuracy trend of HT using different drift detectors on the synthetic datasets.
Figure 9. The ranking frequency of accuracy and CPU time for base classifiers using different drift detectors on the synthetic datasets.
Figure 10. The ranking frequency of detection delay and false positives for different drift detectors.
Figure 11. The overall ranking of different drift detectors based on four key metrics: accuracy, CPU time, detection delay, and false positives.
24 pages, 16296 KiB  
Article
Improving Mineral Classification Using Multimodal Hyperspectral Point Cloud Data and Multi-Stream Neural Network
by Aldino Rizaldy, Ahmed Jamal Afifi, Pedram Ghamisi and Richard Gloaguen
Remote Sens. 2024, 16(13), 2336; https://doi.org/10.3390/rs16132336 - 26 Jun 2024
Viewed by 2072
Abstract
In this paper, we leverage multimodal data to classify minerals using a multi-stream neural network. In a previous study on the Tinto dataset, which consisted of a 3D hyperspectral point cloud from the open-pit mine Corta Atalaya in Spain, we successfully identified mineral classes by employing various deep learning models. However, this prior work solely relied on hyperspectral data as input for the deep learning models. In this study, we aim to enhance accuracy by incorporating multimodal data, which includes hyperspectral images, RGB images, and a 3D point cloud. To achieve this, we have adopted a graph-based neural network, known for its efficiency in aggregating local information, based on our past observations where it consistently performed well across different hyperspectral sensors. Subsequently, we constructed a multi-stream neural network tailored to handle multimodality. Additionally, we employed a channel attention module on the hyperspectral stream to fully exploit the spectral information within the hyperspectral data. Through the integration of multimodal data and a multi-stream neural network, we achieved a notable improvement in mineral classification accuracy: 19.2%, 4.4%, and 5.6% on the LWIR, SWIR, and VNIR datasets, respectively. Full article
(This article belongs to the Special Issue Remote Sensing for Geology and Mapping)
Figure 1. (a) False-color visualization of the LWIR hypercloud data at 10,114, 9181, and 8545 nm, (b) the corresponding semantic label, (c) training (red) and testing (blue) sets, and (d) the class distribution of training (red) and testing (blue) sets.
Figure 2. Illustration of the multi-stream network. The architecture consists of 3 individual streams in the earlier part of the network, which is followed by a typical single-stream network with skip connection links (illustrated as blue arrows) to bring the information from the earlier layers to the latter. The 3 individual streams are designed to work with 3 different modalities. Those are hyperspectral, RGB, and geometrical data. The channel-wise concatenation is utilized for merging the output of the 3 different streams. It is also used for concatenating the skip connection links. Graph reconstruction, MLP, and max-pooling layers are stacked together as the features encoder block. This block is utilized in each stream separately and then repeated three times for deeper layers. Finally, dense layers are applied in the last layers for the classification. Please note that height represents the number of features, width represents the number of k-NN points, and depth represents the number of points for each block.
Figure 3. Schematic diagram of the 3D CNN block in the hyperspectral stream.
Figure 4. (Left): Graph reconstruction using an image ($x_i$ represents the central pixel, $x_j$ represents the surrounding pixel, and $e_{ij}$ denotes the edge features). (Right): Illustration of the 2D-based network.
Figure 5. The predicted point clouds of the LWIR dataset and the corresponding ground truth. (a) PointNet; (b) PointNet++; (c) PointCNN; (d) ConvPoint; (e) DGCNN; (f) Point Transformer; (g) PCT; (h) ours; (i) ground truth; (j) LWIR hypercloud.
Figure 6. The predicted point clouds of SWIR dataset and the corresponding ground truth. (a) PointNet; (b) PointNet++; (c) PointCNN; (d) ConvPoint; (e) DGCNN; (f) Point Transformer; (g) PCT; (h) ours; (i) ground truth; (j) SWIR hypercloud.
Figure 7. The predicted point clouds of VNIR dataset and the corresponding ground truth. (a) PointNet; (b) PointNet++; (c) PointCNN; (d) ConvPoint; (e) DGCNN; (f) Point Transformer; (g) PCT; (h) ours; (i) ground truth; (j) VNIR hypercloud.
Figure 8. Image-based segmentation results using the SWIR dataset. (a) SWIR; (b) ground truth; (c) graph-network; (d) MLP.
Figure 9. Different configuration of input features. (a) LWIR; (b) LWIR + RGB; (c) LWIR + RGB + geo; (e) SWIR; (f) SWIR + RGB; (g) SWIR + RGB + geo; (d,h) ground truth.
Figure 10. The impact of multi-stream (MS) network. (a) LWIR without MS; (b) LWIR with MS; (d) SWIR without MS; (e) SWIR with MS; (c,f) ground truth.
Figure 11. The impact of adding 3D CNN. (a) LWIR without 3DCNN; (b) LWIR with 3D CNN; (d) SWIR without 3DCNN; (e) SWIR with 3DCNN; (c,f) ground truth.
Figure 12. The impact of transforming XYZ into geometric features. (a) LWIR with XYZ; (b) LWIR with geometric features; (d) SWIR with XYZ; (e) SWIR with geometric features; (c,f) ground truth.
15 pages, 3882 KiB  
Article
A Dual-Stream Cross AGFormer-GPT Network for Traffic Flow Prediction Based on Large-Scale Road Sensor Data
by Yu Sun, Yajing Shi, Kaining Jia, Zhiyuan Zhang and Li Qin
Sensors 2024, 24(12), 3905; https://doi.org/10.3390/s24123905 - 17 Jun 2024
Viewed by 883
Abstract
Traffic flow prediction can provide important reference data for managers to maintain traffic order, and can also support optimal route selection for personal travel plans. Thanks to the development of sensors and data collection technology, large-scale historical road network data can now be used effectively, but the high non-linearity of these data makes establishing effective prediction models worthwhile. In this regard, this paper proposes a dual-stream cross AGFormer-GPT network with prompt engineering for traffic flow prediction, which integrates traffic occupancy and speed as two prompts into traffic flow in the form of cross-attention, and mines spatial and temporal correlation information through the dual-stream cross structure, effectively combining the advantages of the adaptive graph neural network and the large language model to improve prediction accuracy. Experimental results on two PeMS road network data sets show that the model improves traffic prediction accuracy by about 1.2% across different road networks. Full article
(This article belongs to the Special Issue Feature Papers in the 'Sensor Networks' Section 2024)
Figure 1. The structure of the prediction model.
Figure 2. The structure of the prompt engineer.
Figure 3. The structure of AGFormer.
Figure 4. Regular fine-tuning and LoRA fine-tuning.
Figure 5. The true value and prediction value on PeMSD4.
Figure 6. The true value and prediction value on PeMSD8.
Figure 7. The three error results of the three ablation models and the original model.
Figure 8. The true value and prediction value on different space points.
Figure 9. The visualization of real cases.
19 pages, 331 KiB  
Article
An Efficient Probabilistic Algorithm to Detect Periodic Patterns in Spatio-Temporal Datasets
by Claudio Gutiérrez-Soto, Patricio Galdames and Marco A. Palomino
Big Data Cogn. Comput. 2024, 8(6), 59; https://doi.org/10.3390/bdcc8060059 - 3 Jun 2024
Viewed by 1113
Abstract
Deriving insight from data is a challenging task for researchers and practitioners, especially when working on spatio-temporal domains. If pattern searching is involved, the complications introduced by temporal data dimensions create additional obstacles, as traditional data mining techniques are insufficient to address spatio-temporal databases (STDBs). We present a new algorithm, which we refer to as F1/FP and which can be described as a probabilistic version of the Minus-F1 algorithm for finding periodic patterns. To the best of our knowledge, no previous work has compared the most cited algorithms in the literature for finding periodic patterns, namely Apriori, MS-Apriori, FP-Growth, Max-Subpattern, and PPA. Thus, we have carried out such comparisons and then evaluated our algorithm empirically using two datasets, showcasing its ability to handle different types of periodicity and data distributions. Through this comparative analysis, we have demonstrated that our newly proposed algorithm has a lower complexity than the existing alternatives and speeds up the performance regardless of the size of the dataset. We expect our work to contribute greatly to the mining of astronomical data and the permanently growing online streams derived from social media. Full article
(This article belongs to the Special Issue Big Data and Information Science Technology)
21 pages, 9980 KiB  
Case Report
The Study of Groundwater in the Zhambyl Region, Southern Kazakhstan, to Improve Sustainability
by Dinara Adenova, Dani Sarsekova, Malis Absametov, Yermek Murtazin, Janay Sagin, Ludmila Trushel and Oxana Miroshnichenko
Sustainability 2024, 16(11), 4597; https://doi.org/10.3390/su16114597 - 29 May 2024
Cited by 5 | Viewed by 1724
Abstract
Water resources are scarce and difficult to manage in Kazakhstan, Central Asia (CA). Anthropic activities largely eliminated the Aral Sea. Afghanistan’s large-scale canal construction may eliminate life in the main stream of the Amu Darya River, CA. Kazakhstan’s HYRASIA ONE project, with a EUR 50 billion investment to produce green hydrogen, is targeted to withdraw water from the Caspian Sea. Kazakhstan, CA, requires sustainable programs that integrate both decision-makers’ and people’s behavior. For this paper, the authors investigated groundwater resources for sustainable use, including for consumption, and the potential for natural “white” hydrogen production from underground geological “factories”. Kazakhstan is rich in natural resources, such as iron-rich rocks, minerals, and uranium, which are necessary for serpentinization reactions and radiolysis decay in natural hydrogen production from underground water. Investigations of underground geological “factories” require substantial efforts in field data collection. A chemical analysis of 40 groundwater samples from the 97 wells surveyed and investigated in the T. Ryskulov, Zhambyl, Baizak and Zhualy districts of the Zhambyl region in South Kazakhstan in 2021–2022 was carried out. These samples were compared with previously collected water samples from the years 2020–2021. The compositions of groundwater samples were analyzed, revealing various concentrations of different minerals, natural geological rocks, and anthropogenic materials. South Kazakhstan is rich in natural mineral resources. As a result, mining companies extract resources in the Taraz–Zhanatas–Karatau and the Shu–Novotroitsk industrial areas. The most significant levels of minerals found in water samples were found in the territory of the Talas–Assinsky interfluve, where the main industrial mining enterprises are concentrated and the largest groundwater deposits have been explored. 
Groundwater compositions have direct connections to geological rocks. The geological rocks are confined to sandstones, siltstones, porphyrites, conglomerates, limestones, and metamorphic rocks. In observation wells, a number of components can be found in high concentrations (mg/L): sulfates 602.0 (MPC 500 mg/L); sodium 436.5 (MPC 200 mg/L); chlorine 465.4 (MPC 350 mg/L); lithium 0.18 (MPC 0.03 mg/L); boron 0.74 (MPC 0.5 mg/L); cadmium 0.002 (MPC 0.001 mg/L); strontium 15.0 (MPC 7.0 mg/L); and TDS 1970 (MPC 1000 mg/L). The high mineral contents in the water are natural and comprise minerals from geological sources, ranging from iron-rich rocks to uranium. Proper groundwater classifications for research investigations are required to separate potable groundwater resources, wells, and areas where underground geological “factories” producing natural “white” hydrogen could potentially be located. Our preliminary investigation results are presented with the aim of creating a large-scale targeted program to improve water sustainability in Kazakhstan, CA. Full article
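The concentrations listed in the abstract can be turned into simple exceedance ratios (observed value divided by the MPC limit). The sketch below uses only the figures quoted above; it assumes that the strontium value printed as "15, 0" means 15.0 mg/L and that the TDS limit is in mg/L, and the names `observed_and_mpc` and `exceedance_ratios` are illustrative.

```python
# (observed concentration, MPC limit), both in mg/L, as quoted in the abstract.
# Assumption: the strontium value printed as "15, 0" is read as 15.0.
observed_and_mpc = {
    "sulfates": (602.0, 500.0),
    "sodium": (436.5, 200.0),
    "chlorine": (465.4, 350.0),
    "lithium": (0.18, 0.03),
    "boron": (0.74, 0.5),
    "cadmium": (0.002, 0.001),
    "strontium": (15.0, 7.0),
    "TDS": (1970.0, 1000.0),
}


def exceedance_ratios(table):
    """Return observed / limit for each component; values above 1 exceed the limit."""
    return {name: observed / limit for name, (observed, limit) in table.items()}


ratios = exceedance_ratios(observed_and_mpc)
worst = max(ratios, key=ratios.get)

# Print components from the largest to the smallest exceedance.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.2f} x MPC")
```

On these numbers every listed component exceeds its limit, and lithium shows the largest relative exceedance at roughly six times its MPC.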
Figure 1. Zhambyl investigation region in Southern Kazakhstan.
Figure 2. The hydrogeological map of the southern part of Zhambyl region, Kazakhstan.
Figure 3. Self-flowing artesian wells found in the Ryskulov district, Zhambyl region, in 2022 (a–c).
Figure 4. Frequency distribution of chemical components in wells (Colins histogram). Analysis of water chemistry samples from the Zhambyl, Baizak, T. Ryskulov and Zhualy districts of Zhambyl region, Kazakhstan.
Figure 5. Piper plot depicting the chemical compositions of groundwater from the Zhambyl, Baizak, T. Ryskulov and Zhualy districts.
Figure 6. Wells exceeding water quality standards are shown in red, including the area with high Sr.
Figure 7. Wells exceeding water quality standards are shown in red.
Figure 8. Location map of flowing wells in the Baizak, Zhambyl, T. Ryskulov and Zhualy districts of Zhambyl region, Kazakhstan, 2022.
Figure 9. The diagram exhibits the potential of Central Asian countries for cooperation in groundwater resource use to prospect natural geological “white” hydrogen production (adapted from USGS Ellis [63]).
18 pages, 2278 KiB  
Article
Dynamics of River Flood Waves below Hydropower Dams and Their Relation to Natural Floods
by Robert E. Criss
Water 2024, 16(8), 1099; https://doi.org/10.3390/w16081099 - 11 Apr 2024
Viewed by 1491
Abstract
The dynamic behavior of flood waves on rivers is essential to flood prediction. Natural flood waves are complex due to tributary inputs, rainfall variations, and overbank flows, so this study examines hydropower dam releases, which are simpler to analyze because channel effects are isolated. Successive arrival times and heights of peaks along 9 rivers with multiple stream gauges downstream of hydroelectric dams show that flow peaks typically become exponentially lower and wider with distance. The propagation velocity of peaks increases with water depth and channel slope but decreases with downstream distance and greater channel tortuosity. A rich hierarchy of velocities was found. Hydropower pulses progress at or in slight excess of the theoretical celerity, which is faster than the propagation rate of average natural floods, which in turn exceeds the mean velocity of water in the channel, yet the water moves faster than the peaks of record floods. The progressive changes to the height, shape, and velocity of hydropower flow peaks are simulated by the first analytical solution to the convolution integral for a rectangular source pulse that is based on diffusion-advection theory. Available data support some widely held expectations while refuting others. An expanded definition of “water mining” is proposed. Full article
Figure 1. (a) Map showing major rivers in the United States [28,29] and the location (red dots) of the river gauges listed in Table 1. The initial zero is omitted from several site numbers, and 3 sites cannot be labeled for clarity. (b) Digital elevation map (DEM) of Missouri [30] in the central United States, showing the Bagnell, Truman, and Clarence Cannon Dams on the Osage, Sac, and Salt Rivers, respectively, and the river gauges (blue dots; Table 1) below those dams. (c) A detailed map showing major roads, the Bagnell Dam, and the three river gauges (red dots) along the Osage River in Missouri that were selected for detailed study.
Figure 2. (A) Graphs of flow vs. time for various choices of the time constant b, calculated with Equation (8a,b) for diffusion only. The curve for b = 0 represents the rectangular, unit flow pulse that issues from the source (dam), while the curves for increasing values of b simulate the flow variations at increasing distances downstream. (B) Flows calculated with Equation (11a,b) that incorporate both celerity and diffusion for different indicated distances.
Figure 3. Propagation velocities ($V_{hp}$) of the 494 hydropower flow peaks for each of two reaches along the Osage River plotted against the observed, uncorrected, local stage of those flow peaks at the Tuscumbia gauge. Six points for the Osage City to Tuscumbia reach (X’s) are offscale.
Figure 4. Comparison of the peak discharges at Tuscumbia ($Q_{tusc}$) and St. Thomas ($Q_{StT}$), each normalized to the peak discharge at Osage City ($Q_{oc}$), all corrected for the minimum discharge at Osage City prior to each pulse event. The proportional losses (attenuation) at St. Thomas are greater than those at Tuscumbia. The solid curve shows the trend predicted by Equation (12d). Of the 494 pulse events selected for study, 20 are offscale.
Figure 5. Peak attenuation at Tuscumbia (dots) and at St. Thomas (squares) relative to peak size at Osage City, plotted against the travel time of the peaks between each site and Osage City. Peak flows become more attenuated as travel time increases, as predicted by Equation (12a), assuming that the travel time is related to distance divided by celerity. Twenty pairs of points are offscale.
Figure 6. Hydrographs at Osage City (red; 2.1 km; USGS 06926000), Tuscumbia (green; 24.6 km; USGS 06926080), and St. Thomas (blue; 75.9 km; USGS 06926510), driven by the 8 h, rectangular flow pulse of 22 May 2023 released from Bagnell Dam (black; 0 km; Ameren 2023). The USGS gauge data [27] are reported at 15 min intervals, but the Ameren data [37] are hourly. Note the progressive attenuation and delay of the peaks with distance downstream of Bagnell Dam. Compare these data with the theoretical predictions in Figure 2B for similar thalweg distances and source pulse durations.
20 pages, 740 KiB  
Article
Fair-CMNB: Advancing Fairness-Aware Stream Learning with Naïve Bayes and Multi-Objective Optimization
by Maryam Badar and Marco Fisichella
Big Data Cogn. Comput. 2024, 8(2), 16; https://doi.org/10.3390/bdcc8020016 - 31 Jan 2024
Viewed by 2073
Abstract
Fairness-aware mining of data streams is a challenging concern in the contemporary domain of machine learning. Many stream learning algorithms are used to replace humans in critical decision-making processes, e.g., hiring staff, assessing credit risk, etc. This calls for handling massive amounts of incoming information with minimal response delay while ensuring fair and high-quality decisions. Although deep learning has achieved success in various domains, its computational complexity may hinder real-time processing, making traditional algorithms more suitable. In this context, we propose a novel adaptation of Naïve Bayes to mitigate discrimination embedded in the streams while maintaining high predictive performance through multi-objective optimization (MOO). Class imbalance is an inherent problem in discrimination-aware learning paradigms. To deal with class imbalance, we propose a dynamic instance weighting module that gives more importance to new instances and less importance to obsolete instances based on their membership in a minority or majority class. We have conducted experiments on a range of streaming and static datasets and concluded that our proposed methodology outperforms existing state-of-the-art (SoTA) fairness-aware methods in terms of both discrimination score and balanced accuracy. Full article
(This article belongs to the Special Issue Big Data and Cognitive Computing in 2023)
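The two evaluation measures named in the abstract, discrimination score and balanced accuracy, can be computed as in the sketch below. The figure captions for this entry identify the discrimination score with statistical parity; the sign convention used here (non-protected rate minus protected rate), the binary 0/1 encoding, and the function names are assumptions for illustration and are not taken from the Fair-CMNB implementation.

```python
def statistical_parity(preds, protected):
    """P(pred = 1 | non-protected) - P(pred = 1 | protected).

    `preds` holds 0/1 predictions; `protected` holds booleans marking
    membership in the protected group. A value of 0 means parity.
    """
    non_prot = [p for p, s in zip(preds, protected) if not s]
    prot = [p for p, s in zip(preds, protected) if s]
    return sum(non_prot) / len(non_prot) - sum(prot) / len(prot)


def balanced_accuracy(y_true, preds):
    """Mean of the true positive rate and the true negative rate."""
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    return 0.5 * (tp / positives + tn / negatives)


# Toy data: the first four instances are non-protected, the last four protected.
preds = [1, 1, 0, 1, 0, 0, 1, 0]
protected = [False, False, False, False, True, True, True, True]
y_true = [1, 1, 0, 0, 1, 0, 1, 0]

print(statistical_parity(preds, protected))  # 0.75 - 0.25 = 0.5
print(balanced_accuracy(y_true, preds))      # (0.75 + 0.75) / 2 = 0.75
```

In a streaming setting both quantities would be maintained incrementally over the stream rather than recomputed from full lists; the batch form is used here only to keep the definitions visible.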
Figure 1. Illustration of proposed method (Fair-CMNB). (A) Prediction of new instance ($x_t$), (B) Discrimination detection, (C) Online class imbalance monitoring, (D) Instance weighting, (E) Concept drift detection, (F) Training of online nominal Naïve Bayes, (G) Training of online Gaussian Naïve Bayes, (H) Discrimination mitigation through multi-objective optimization (MOO).
Figure 2. Comparison between balanced accuracy (B.Acc.) and St.Parity values achieved by Fair-CMNB and FABBOO for Bank Marketing, Law School, and Default datasets. Notably, Fair-CMNB consistently outperforms FABBOO in terms of B.Acc. throughout the stream for all datasets while maintaining very low St.Parity.
Figure 3. Impact of varying $\lambda$ on discrimination score (statistical parity) for Adult dataset.