Search Results (531)

Search Parameters:
Keywords = distributed and parallel computing

32 pages, 4355 KiB  
Article
Optimizing Virtual Power Plants with Parallel Simulated Annealing on High-Performance Computing
by Ali Abbasi, Filipe Alves, Rui A. Ribeiro, João L. Sobral and Ricardo Rodrigues
Smart Cities 2025, 8(2), 47; https://doi.org/10.3390/smartcities8020047 - 12 Mar 2025
Abstract
This work focuses on optimizing the scheduling of virtual power plants (VPPs), as implemented in the Portuguese national project New Generation Storage (NGS), to maximize social welfare and enhance energy trading efficiency within modern energy grids. By integrating distributed energy resources (DERs), including renewable energy sources and energy storage systems, VPPs represent a pivotal element of sustainable urban energy systems. The scheduling problem is formulated as a Mixed-Integer Linear Programming (MILP) task and addressed by using a parallelized simulated annealing (SA) algorithm implemented on high-performance computing (HPC) infrastructure. This parallelization accelerates solution space exploration, enabling the system to efficiently manage the complexity of larger DER networks and more sophisticated scheduling scenarios. The approach demonstrates its capability to align with the objectives of smart cities by ensuring adaptive and efficient energy distribution, integrating dynamic pricing mechanisms, and extending the operational lifespan of critical energy assets such as batteries. Rigorous simulations highlight the method’s ability to reduce optimization time, maintain solution quality, and scale efficiently, facilitating real-time decision making in energy markets. Moreover, the optimized coordination of DERs supports grid stability, enhances market responsiveness, and contributes to developing resilient, low-carbon urban environments. This study underscores the transformative role of computational infrastructure in addressing the challenges of modern energy systems, showcasing how advanced algorithms and HPC can enable scalable, adaptive, and sustainable energy optimization in smart cities. The findings demonstrate a pathway to achieving socially and environmentally responsible energy systems that align with the priorities of urban resilience and sustainable development.
Figure 1. A schematic of the VPP management framework, illustrating the core components and data flow among prosumers, the VPP market, and the grid. The VPP acts as an intermediary, facilitating energy trading under grid-regulated prices, with a central scheduler optimizing resource allocation.
Figure 2. Diagram illustrating the process of search space reduction for each prosumer. Initially, the decision space is represented as a planar 2D space. Optimization simplifies this to a linear path confined by upper and lower bounds, enhancing computational efficiency and enabling the system to manage larger networks of prosumers effectively.
Figure 3. A schematic of the high-performance computing cluster architecture: This diagram illustrates the HPC cluster infrastructure, where each physical node manages the optimization process of an independent VPP. Within each node, a soft computing layer executes parallel SA by using OpenMP, enabling efficient parallelization across VPP operations and ensuring optimized scheduling. This configuration enhances scalability and computation speed, supporting real-time VPP management and maximizing energy system responsiveness.
Figure 4. Convergence behavior of the SA algorithm for the VPP scheduling problem, highlighting its progression through initialization, exploration, and exploitation phases to efficiently achieve optimal solutions.
Figure 5. Execution time vs. number of cores for different player counts, benchmarked using $1 \times 10^6$ iterations of simulated annealing.
Figure 6. Speedup ratio vs. number of cores for different player counts with ideal speedup line.
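The parallel simulated annealing described in the abstract above can be pictured as several independent annealing chains whose best result is kept. The following is only a minimal sketch under stated assumptions, not the authors' OpenMP implementation: `toy_cost`, `anneal`, `parallel_anneal`, the cooling schedule, and all parameter values are hypothetical, and the toy objective is a simple sum of squares rather than the MILP scheduling problem. Threads are used here only to show the structure; in CPython they do not give real CPU speedup for this kind of loop.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def toy_cost(x):
    """Hypothetical stand-in objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def anneal(seed, dims=4, iterations=20000, t_start=1.0, t_end=1e-3):
    """Run one independent annealing chain and return (best_cost, best_x)."""
    rng = random.Random(seed)  # per-chain generator, so chains share no state
    x = [rng.uniform(-5.0, 5.0) for _ in range(dims)]
    cost = toy_cost(x)
    best_cost, best_x = cost, list(x)
    for i in range(iterations):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (iterations - 1))
        candidate = list(x)
        candidate[rng.randrange(dims)] += rng.gauss(0.0, 0.5)
        cand_cost = toy_cost(candidate)
        delta = cand_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, cost = candidate, cand_cost
            if cost < best_cost:
                best_cost, best_x = cost, list(x)
    return best_cost, best_x


def parallel_anneal(num_chains=4):
    """Run several chains concurrently and keep the lowest-cost result."""
    with ThreadPoolExecutor(max_workers=num_chains) as pool:
        results = list(pool.map(anneal, range(num_chains)))
    return min(results, key=lambda r: r[0]), results


best, all_results = parallel_anneal()
print("best cost over all chains:", best[0])
```

Seeding each chain separately keeps runs reproducible, and taking the minimum over chains means that adding chains can only keep or improve the reported cost for a fixed per-chain budget.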
22 pages, 6955 KiB  
Article
A Novel Multi-Dynamic Coupled Neural Mass Model of SSVEP
by Hongqi Li, Yujuan Wang and Peirong Fu
Biomimetics 2025, 10(3), 171; https://doi.org/10.3390/biomimetics10030171 - 11 Mar 2025
Viewed by 143
Abstract
Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) leverage high-speed neural synchronization to visual flicker stimuli for efficient device control. While SSVEP-BCIs minimize user training requirements, their dependence on physical EEG recordings introduces challenges, such as inter-subject variability, signal instability, and experimental complexity. To overcome these limitations, this study proposes a novel neural mass model for SSVEP simulation by integrating frequency response characteristics with dual-region coupling mechanisms. Specific parallel linear transformation functions were designed based on SSVEP frequency responses, and weight coefficient matrices were determined according to the frequency band energy distribution under different visual stimulation frequencies in the pre-recorded SSVEP signals. A coupled neural mass model was constructed by establishing connections between occipital and parietal regions, with parameters optimized through particle swarm optimization to accommodate individual differences and neuronal density variations. Experimental results demonstrate that the model achieved a high-precision simulation of real SSVEP signals across multiple stimulation frequencies (10 Hz, 11 Hz, and 12 Hz), with maximum errors decreasing from 2.2861 to 0.8430 as frequency increased. The effectiveness of the model was further validated through the real-time control of an Arduino car, where simulated SSVEP signals were successfully classified by the advanced FPF-net model and mapped to control commands. This research not only advances our understanding of SSVEP neural mechanisms but also releases the user from the brain-controlled coupling system, thus providing a practical framework for developing more efficient and reliable BCI-based systems.
(This article belongs to the Special Issue Computational Biology Simulation, Agent-Based Modelling and AI)
Figure 1. The traditional neural mass model, which contains excitatory interneurons, inhibitory interneurons, and pyramidal neurons. A sigmoid function $S(v)$ and differential equations for excitatory ($h_e$) and inhibitory ($h_i$) responses are included to describe the dynamic behavior of the interested subpopulation. The external input $n(t)$ is modeled as Gaussian white noise, which introduces variability to the signal, and the coupling coefficients $C_1$, $C_2$, $C_3$, and $C_4$ define the interaction strengths between different neural subpopulations. The output signal $E^i(t)$, being the difference between the excitatory and inhibitory responses, represents the EEG-like signal produced by the model.
Figure 2. The multi-dynamic neural mass model for SSVEPs.
Figure 3. The SSVEP-BCI multi-dynamic coupled neural mass model. The occipital and parietal regions are represented by a multi-dynamic NMM of Figure 2, where three parallel linear transfer functions are involved in the excitatory and inhibitory interneurons. The membrane potential of each intra-regional pyramidal cell (i.e., $y_{out}$) is first transformed into mean spike density through the static nonlinear function $s(v)$ and then processed by the cross-regional neural encoder.
Figure 4. Simulated signal curves varying with $\mu$ and their spectral power. As $\mu$ increased from 50 to 200, the rhythmic characteristics gradually intensified, with a final pronounced spectral peak at 10 Hz.
Figure 5. Simulated signal curves varying with $\sigma^2$ when $\mu$ = 220 and their spectral power. As $\sigma^2$ increased from 50 to 20,000, slight to progressive changes of amplitude and spectral peaks variations were observed.
Figure 6. Simulated signal curves varying with $\sigma^2$ when $\mu$ = 90 and their normalized spectral power. As $\sigma^2$ increased, signal amplitudes gradually increased (e.g., from 100 to 3000), and even led to irregular spike activity (from 6000 or 20,000).
Figure 7. The simulated signals and spectral power of the occipital region without coupling. Due to the high weight assigned to α, the waveform fluctuated around the alpha wave, and as the delta wave component increased, spike activity decreased with a gradual left-ward shift in frequency peaks.
Figure 8. The simulated signals and spectra of the occipital region under unidirectional coupling. As the parietal-to-occipital coupling strength ($p_o$) increased, while maintaining zero occipital-to-parietal coupling ($o_p$ = 0), the occipital region showed an increased signal amplitude while maintaining stable frequency characteristics, accompanied by enhanced spectral peak values.
Figure 9. The simulated signals and spectra of the occipital region under bidirectional coupling with different dynamic characteristics. When the coupling strength between the regions increases, the occipital region model simulation signal spikes are reduced, and the spectral peaks are gradually shifted to the left.
Figure 10 (and Figure 10 Cont.). Comparison of real and simulated SSVEP under three types of visual stimuli, where overall waveform pattern of simulated signals remains consistent with real signals.
Figure 11. FPF-net structure.
Figure 12. Arduino car movement based on simulated SSVEP.
26 pages, 6719 KiB  
Article
Sketch-Guided Topology Optimization with Enhanced Diversity for Innovative Structural Design
by Siyu Zhu, Jie Hu, Jin Qi, Lingyu Wang, Jing Guo, Jin Ma and Guoniu Zhu
Appl. Sci. 2025, 15(5), 2753; https://doi.org/10.3390/app15052753 - 4 Mar 2025
Viewed by 147
Abstract
Topology optimization (TO) is a powerful generative design tool for innovative structural design, capable of optimizing material distribution to generate structures with superior performance. However, current topology optimization algorithms mostly target a single objective and are highly dependent on the problem definition parameters, causing two critical issues: limited human controllability and solution diversity. These issues often lead to burdensome design iterations and insufficient design exploration. This paper proposes a multi-solution TO framework to address them. Human designers express their stylistic preferences for structures through sketches which are decomposed into stroke and closed-shape elements to flexibly guide each TO process. Sketch-based constraints are integrated with Fourier mapping-based length-scale control to enhance human controllability. Solution diversity is achieved by perturbing Fourier mapping frequencies and load conditions in the neural implicit TO framework. Adaptive parallel scale adjustment is incorporated to reduce the computational cost for design exploration. Using the structural design of a wheel spoke as a case study, the mechanical performance and diversity of the generated TO solutions as well as the effectiveness of human control are analyzed both qualitatively and quantitatively. The results reveal that the sketch-based constraints and length-scale control have distinct control effects on structural features and have different impacts on the mechanical performance and diversity, thereby enabling fine-grained and flexible human controllability to better balance conflicting objectives.
(This article belongs to the Special Issue Computer-Aided Design in Mechanical Engineering)
Figure 1. Overview of proposed multi-solution TO framework with human controllability.
Figure 2. Definition of design field, boundary, and load conditions.
Figure 3. Summary of experiment design.
Figure 4. Loss curves and intermediate solutions during the iterative TO process.
Figure 5. Generated innovative structural designs constrained by sketch image 1.
Figure 6. Generated innovative structural designs constrained by sketch image 2.
Figure 7. Generated innovative structural designs constrained by sketch images 3–6.
Figure 8. Illustration of 2D-to-3D conversion steps.
Figure 9. Innovative design exploration results for automobile wheel spokes under sketch-based constraints. Randomly perturbed hyper-parameters include $dv$, $l_{\min}$, and $\boldsymbol{f}$. (a) Guided by sketch 1. (b) Guided by sketch 2. (c) Guided by sketch 3. (d) Guided by sketch 4.
Figure 10. Experimental results of human controllability by integrating $l_{\min}$ and $\lambda$.
Figure 11. Ablation study of $\beta_{Inc}$ (a), $\lambda$ (b), and $l_{\min}$ (c) in terms of compliance.
Figure 12. Ablation study of $\beta_{Inc}$ (a), $\lambda$ (b), and $l_{\min}$ (c) in terms of volume.
Figure 13. Ablation study of $\beta_{Inc}$ (a), $\lambda$ (b), and $l_{\min}$ (c) in terms of $\overline{SSIM}$.
Figure 14. Ablation study of $\beta_{Inc}$ (a), $\lambda$ (b), and $l_{\min}$ (c) in terms of $\overline{MAE}$.
Figure 15. Unreasonable generated structures due to conflicts between sketch-based constraints, volume constraint, and compliance.
Figure 16. Examples of recognized regions of conflict.
20 pages, 3815 KiB  
Article
A Benchmark for Water Surface Jet Segmentation with MobileHDC Method
by Yaojie Chen, Qing Quan, Wei Wang and Yunhan Lin
Appl. Sci. 2025, 15(5), 2755; https://doi.org/10.3390/app15052755 - 4 Mar 2025
Viewed by 162
Abstract
Intelligent jet systems are widely used in various fields, including firefighting, marine operations, and underwater exploration. Accurate extraction and prediction of jet trajectories are essential for optimizing their performance, but challenges arise due to environmental factors such as climate, wind direction, and suction efficiency. To address these issues, we introduce two novel jet segmentation datasets, Libary and SegQinhu, which cover both indoor and outdoor environments under varying weather conditions and temporal intervals. These datasets present significant challenges, including occlusions and strong light reflections, making them ideal for evaluating jet trajectory segmentation methods. Through empirical evaluation of several state-of-the-art (SOTA) techniques on these datasets, we observe that general methods struggle with highly imbalanced pixel distributions in jet trajectory images. To overcome this, we propose a data-driven pipeline for jet trajectory extraction and segmentation. At its core is MobileHDC, a new baseline model that leverages the MobileNetV2 architecture and integrates dilated convolutions to enhance the receptive field without increasing computational cost. Additionally, we introduce a parallel convolutional block and a decoder to fuse multi-level features, enabling a better capture of contextual information and improving the continuity and accuracy of jet segmentation. The experimental results show that our method outperforms existing SOTA techniques on both jet-specific datasets, highlighting the effectiveness of our approach.
Figure 1. Segmentation performance of the SAM model on the Libary and SegQinhu datasets, revealing issues in segmentation where the model tends to misclassify background as jet flow.
Figure 2. Relative frequency of annotated jet pixels within an image over the 1300 images in the Libary dataset (a) and the 823 images in the SegQinhu dataset (b), respectively. Here, the fraction of jet pixels serves as proxy for the size of the objects of interest within an image. (a) Libary, (b) SegQinhu.
Figure 3. Sample images of jet states from the Libary dataset under various conditions, including strong lighting and minimal pixel coverage.
Figure 4. Sample images of jet morphologies from the SegQinhu dataset under various conditions, including occlusion, partial coverage, and reflective scenarios.
Figure 5. An overview of the basic architecture of our proposed model. Here, we set the parameters $N_1$, $N_2$, $N_3$ for the repeated times as $N_1 = 6$, $N_2 = 4$ and $N_3 = 2$. The operation ⊕ represents the concatenation operation.
Figure 6. Diagram of dilated convolution. When the dilation rate is 1, it behaves identically to a standard convolution.
Figure 7. Diagram of hybrid dilated convolution layers, where C and C1 represent the number of channels, with C = 160 and C1 = 256, and r = a indicates the dilation rate = a. Additionally, $x_s$ represents the feature maps from the 7th layer of the MobileNetV2 network.
Figure 8. Visualization of the jet segmentation results of the different methods on the Libary testing dataset.
Figure 9. Visualization of the jet segmentation results of the different methods on the SegQinhu testing dataset.
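The dilated convolution mentioned in the abstract and in the Figure 6 caption of this entry can be illustrated in one dimension. This is a minimal sketch, not the MobileHDC layer itself: `dilated_conv1d`, `receptive_field`, and the sample signal and kernel are hypothetical names and values chosen for illustration. It shows that the number of kernel weights stays the same while the span of input covered by one output sample grows with the dilation rate, and that a rate of 1 reduces to a standard convolution.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output sample."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Unpadded 1D cross-correlation with `dilation - 1` skipped samples between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # three weights in both cases

standard = dilated_conv1d(signal, kernel, dilation=1)  # spans 3 samples
dilated = dilated_conv1d(signal, kernel, dilation=2)   # spans 5 samples

print(standard)  # [-2, -2, -2, -2, -2]
print(dilated)   # [-4, -4, -4]
```

With three weights, the span grows from 3 to 5 input samples at a dilation rate of 2, which is the sense in which the receptive field is enlarged without adding parameters.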
25 pages, 24262 KiB  
Article
Dynamic Load Balancing Based on Hypergraph Partitioning for Parallel Geospatial Cellular Automata Models
by Wei Xia, Qingfeng Guan, Yuanyuan Li, Hanqiu Yue, Xue Yang and Huan Gao
ISPRS Int. J. Geo-Inf. 2025, 14(3), 109; https://doi.org/10.3390/ijgi14030109 - 1 Mar 2025
Viewed by 387
Abstract
Parallel computing techniques have been adopted in geospatial cellular automata (CA) models to improve computational efficiency, enabling large-scale complex simulations of land use and land cover (LULC) changes at fine scales. However, the spatial distribution of computational intensity often changes along with the spatiotemporal dynamics of LULC during the simulation, leading to an increase in load imbalance among computing units and degradation of the computational performance of a parallel CA. This paper presents a dynamic load balancing method based on hypergraph partitioning for multi-process parallel geospatial CA models. During the simulation, the sub-domains are dynamically reassigned to computing processes through hypergraph partitioning according to the spatial variation in computational workloads to restore load balance. In addition, a novel mechanism called Migrated-SubCellspaces-First (MSCF) is proposed to reduce the cost of workload migration by employing a non-blocking communication technique to further improve computational performance. To demonstrate and evaluate the effectiveness of our method, a parallel geospatial CA model with hypergraph-based dynamic load balancing is developed. Experiments using a dataset from California showed that the proposed dynamic load balancing method achieved a computational performance enhancement of 62.59% by using 16 processes compared with a parallel CA with static load balancing.
Figure 1. An example of data parallelism for multilayer geospatial CA.
Figure 2. Regular and irregular domain decomposition methods.
Figure 3. Approaches for M-1 assignment.
Figure 4. Two approaches for data I/O.
Figure 5. Different types of cells and communications among computing units.
Figure 6. The framework of hypergraph-based dynamic load balancing for a parallel CA.
Figure 7. An example of sub-domain assignment by hypergraph partitioning; the darker color of the circle indicates higher computational intensity of the sub-domain.
Figure 8. An example of an initial sub-cellspace assignment.
Figure 9. Initial assignment by hypergraph partitioning. Twelve sub-cellspaces are divided into 3 groups (i.e., assignments for 3 processes). Sub-cellspaces are depicted as circles (darker color represents higher computational intensity and longer computing time), and communication costs are depicted as squares. Squares a, b, and c represent the communication costs among processes.
Figure 10. (a) A sample of computational workloads becomes highly imbalanced at iteration k-1. (b) A solution of repartitioning hypergraph.
Figure 11. An example of MSCF.
Figure 12. Flowchart of logistic CA.
Figure 13. A 50-year simulation of urban growth.
Figure 14. Computing times of sLCA, pLCA, and dpLCA on different numbers of processes.
Figure 15. RSD value of pLCA and dpLCA on different numbers of processes.
Figure 16. Speedups of pLCA and dpLCA on different numbers of processes.
Figure 17. Parallel efficiency of pLCA and dpLCA on different numbers of processes.
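The idea of reassigning sub-domains when workloads drift, as described in the abstract of this entry, can be sketched with a much simpler stand-in. The code below does not implement hypergraph partitioning and ignores the communication costs that a hypergraph models: it swaps in a greedy longest-processing-time heuristic purely for illustration. The names `relative_std_dev` and `greedy_assign` and the workload numbers are hypothetical, and taking the relative standard deviation (RSD) of per-process load as the imbalance measure is an assumption.

```python
import statistics


def relative_std_dev(loads):
    """Population standard deviation of per-process load divided by the mean load."""
    mean = statistics.fmean(loads)
    return statistics.pstdev(loads) / mean if mean else 0.0


def greedy_assign(workloads, num_processes):
    """Place the heaviest sub-domains first, each on the currently least-loaded process."""
    loads = [0.0] * num_processes
    assignment = {}
    by_weight = sorted(workloads.items(), key=lambda kv: kv[1], reverse=True)
    for sub_id, work in by_weight:
        target = loads.index(min(loads))
        assignment[sub_id] = target
        loads[target] += work
    return assignment, loads


# Hypothetical per-sub-domain workloads after the simulation has drifted.
workloads = {0: 9, 1: 8, 2: 7, 3: 6, 4: 1, 5: 1, 6: 2, 7: 2}

# Static block assignment: sub-domains 0-3 on process 0, 4-7 on process 1.
static_loads = [sum(workloads[i] for i in range(0, 4)),
                sum(workloads[i] for i in range(4, 8))]

assignment, balanced_loads = greedy_assign(workloads, num_processes=2)

print(static_loads, relative_std_dev(static_loads))
print(balanced_loads, relative_std_dev(balanced_loads))
```

In this toy case the static split leaves one process with 30 units of work and the other with 6, while the greedy reassignment gives each process 18. A real repartitioning step would also have to weigh the cost of migrating sub-domains, which is what the MSCF mechanism in the abstract addresses.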
25 pages, 4930 KiB  
Article
Implementation of a Data-Parallel Approach on a Lightweight Hash Function for IoT Devices
by Abdullah Sevin
Mathematics 2025, 13(5), 734; https://doi.org/10.3390/math13050734 - 24 Feb 2025
Viewed by 203
Abstract
The Internet of Things is used in many application areas in our daily lives. Ensuring the security of valuable data transmitted over the Internet is a crucial challenge. Hash functions are used in cryptographic applications such as integrity, authentication and digital signatures. Existing lightweight hash functions leverage task parallelism but provide limited scalability. There is a need for lightweight algorithms that can efficiently utilize multi-core platforms or distributed computing environments with high degrees of parallelization. For this purpose, a data-parallel approach is applied to a lightweight hash function to achieve massively parallel software. A novel structure suitable for data-parallel architectures, inspired by basic tree construction, is designed. Furthermore, the proposed hash function is based on a lightweight block cipher and seamlessly integrated into the designed framework. The proposed hash function satisfies security requirements, exhibits high efficiency and achieves significant parallelism. Experimental results indicate that the proposed hash function performs comparably to the BLAKE implementation, with slightly slower execution for large message sizes but marginally better performance for smaller ones. Notably, it surpasses all other evaluated algorithms by at least 20%, maintaining a consistent 20% advantage over Grostl across all data sizes. Regarding parallelism, the proposed PLWHF achieves a speedup of approximately 40% when scaling from one to two threads and 55% when increasing to three threads. Raspberry Pi 4-based tests for IoT applications have also been conducted, demonstrating the hash function’s effectiveness in memory-constrained IoT environments. Statistical tests demonstrate a precision of ±0.004, validate the hypothesis in distribution tests and indicate a deviation of ±0.05 in collision tests, confirming the robustness of the proposed design.
(This article belongs to the Section E1: Mathematics and Computer Science)
Figure 1. Hash function in IoT application.
Figure 2. Basic hash properties.
Figure 3. The general structure of the proposed hash function.
Figure 4. The SIMON block cipher with CTR mode.
Figure 5. The sequences of the bit binary presentations under six conditions.
Figure 6. Distribution of changed bit numbers.
Figure 7. Statistical histogram of changed bit numbers.
Figure 8. Frequency distribution of hash value.
Figure 9. The variance in bit location index.
Figure 10. Comparison of running times.
Figure 11. Comparison of running times for various threads.
Figure 12. Multi-threading performance on Raspberry Pi 4.
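The tree-inspired, data-parallel structure described in the abstract of this entry can be illustrated with a two-level tree hash: the message is split into chunks, each chunk is hashed independently (and therefore can be hashed concurrently), and the leaf digests are hashed once more to form the root. This is a sketch under stated assumptions rather than the proposed PLWHF: SHA-256 from Python's `hashlib` stands in for the SIMON-based primitive, and `leaf_digest`, `tree_hash`, the chunk size, and the domain-separation bytes are hypothetical choices that have not been vetted as a secure construction.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def leaf_digest(indexed_chunk):
    """Hash one chunk; the leaf marker and index make the digest position-dependent."""
    index, chunk = indexed_chunk
    return hashlib.sha256(b"\x00" + index.to_bytes(8, "big") + chunk).digest()


def tree_hash(message, chunk_size=1024, workers=4):
    """Two-level tree hash: leaves are hashed concurrently, then combined into a root."""
    chunks = [message[i:i + chunk_size]
              for i in range(0, len(message), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        leaves = list(pool.map(leaf_digest, enumerate(chunks)))
    # The root marker and total length separate the root from the leaf hashes.
    root = hashlib.sha256(b"\x01" + len(message).to_bytes(8, "big") + b"".join(leaves))
    return root.hexdigest()


message = bytes(range(256)) * 40  # 10,240 bytes of sample data
digest_one_worker = tree_hash(message, workers=1)
digest_four_workers = tree_hash(message, workers=4)
print(digest_one_worker == digest_four_workers)  # True
```

Because each leaf depends only on its own chunk and index, the result does not depend on how many workers are used, which is the property that lets the leaf level scale across threads or cores.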
17 pages, 1770 KiB  
Article
Revisiting the Mechanistic Pathway of Gas-Phase Reactions in InN MOVPE Through DFT Calculations
by Xiaokun He, Nan Xu, Yuan Xue, Hong Zhang, Ran Zuo and Qian Xu
Molecules 2025, 30(4), 971; https://doi.org/10.3390/molecules30040971 - 19 Feb 2025
Viewed by 315
Abstract
III-nitrides are crucial materials for solar flow batteries due to their versatile properties. In contrast to the well-studied metal–organic vapor-phase epitaxy (MOVPE) reaction mechanism for AlN and GaN, few works report gas-phase mechanistic studies on the growth of InN. To better understand the reaction thermodynamics, this work revisited the gas-phase reactions involved in the MOVPE growth of InN. Utilizing the M06-2X functional in conjunction with Pople’s triple-ζ split-valence basis set with polarization functions, this work recharacterized all stationary points reported in previous literature and compared the differences between the structures and reaction energies. For the reaction pathways which do not include a transition state, rigorous constrained geometry optimizations were utilized to scan the PES connecting the reactants and products in adduct formation and XMIn (M, D, T) pyrolysis, confirming that there are no TSs in these pathways, which is in agreement with the previous findings. A comprehensive bonding analysis indicates that in TMIn:NH3, the In-N demonstrates strong coordinate bond characteristics, whereas in DMIn:NH3 and MMIn:NH3, the interactions between the Lewis acid and base fragments lean toward electrostatic attraction. Additionally, the NBO computations show that the H radical can facilitate the migration of electrons that are originally distributed between the In-C bonds in XMIn. Based on this finding, novel reaction pathways were also investigated. When the H radical approaches MMInNH2, MMIn:NH3 rather than MMInHNH2 is generated, and this is followed by the elimination of CH4 via two parallel paths. Considering the abundance of H2 in the environment, this work also examines the reactions between H2 and XMIn. The Mulliken charge distributions indicated that intermolecular electron transfer mainly occurs between the In atom and N atom while forming (DMInNH2)2, whereas it predominantly occurs between the In atom and the N atom intramolecularly when generating (DMInNH2)3.
(This article belongs to the Section Physical Chemistry)
Figure 1. Two parallel paths with the elimination of CH4 from MMIn:NH3 and corresponding molecular structures.
Figure 2. The relaxed scan for adduct formation (A1–A1b) and pyrolysis reaction (P4–P4b). [Annotation 1] The PES was explored by constrained geometry optimization and connects the dissociated In(CH3)x−1 and CH3, or In(CH3)x and NH3. [Annotation 2] Relative energy refers to the electronic energy difference between each scan point and the first scan point (i.e., the reactants).
Figure 3. The ESP map of TMIn and NH3.
Figure 4. The HOMO and LUMO and the associated E_gap of TS in reactions A1, A1a and A1b.
Figure 5. The HOMO and LUMO and the E_gap (in eV) of TMIn, DMIn and MMIn.
Figure 6. The critical bond lengths and atom distances (in Å) along with bond angles (in °) in the fully optimized TS of R9.
Figure 7. The ESP map of DMInNH2.
39 pages, 1027 KiB  
Review
State of the Art in Parallel and Distributed Systems: Emerging Trends and Challenges
by Fei Dai, Md Akbar Hossain and Yi Wang
Electronics 2025, 14(4), 677; https://doi.org/10.3390/electronics14040677 - 10 Feb 2025
Viewed by 970
Abstract
Driven by rapid advancements in interconnection, packaging, integration, and computing technologies, parallel and distributed systems have significantly evolved in recent years. These systems have become essential for addressing modern computational demands, offering enhanced processing power, scalability, and resource efficiency. This paper provides a comprehensive overview of parallel and distributed systems, exploring their interrelationships, their key distinctions, and the emerging trends shaping their evolution. We analyse four parallel computing paradigms—heterogeneous computing, quantum computing, neuromorphic computing, and optical computing—and examine emerging distributed systems such as blockchain, serverless computing, and cloud-native architectures. The associated challenges are highlighted, and potential future directions are outlined. This work serves as a valuable resource for researchers and practitioners aiming to stay informed about trends in parallel and distributed computing while understanding the challenges and future developments in the field. Full article
(This article belongs to the Special Issue Emerging Distributed/Parallel Computing Systems)
Figure 1. Logical overview of this paper’s structure. This figure illustrates the organisation of sections, their interdependencies, and the logical progression of topics in this review.
Figure 2. Evolution of various computing eras. This figure outlines the evolution of computing, from single-engine serial processing to ultra-heterogeneous parallel processing, highlighting key stages in this transformation. The different colours in the squares represent various processor types utilized in each stage.
Figure 3. Hardware and software layers of UHC. This figure depicts the essential software and hardware components required for UHC systems, emphasising interoperability and workload distribution.
Figure 4. Qubit growth in quantum computers over recent years. This figure presents the increasing number of qubits in quantum processors, reflecting advancements in quantum computing technology.
Figure 5. Overview of QML. This figure illustrates the integration of quantum computing principles in ML, showing how quantum algorithms leverage qubit-based computation. The green arrows indicate the data flow of quantum information between processing units.
Figure 6. Basic structure of a blockchain block. This figure presents the fundamental components of a blockchain block, explaining how distributed ledger technology ensures security and integrity in decentralised networks.
Figure 7. Key building blocks of a cloud-native architecture. This figure illustrates the four fundamental components of cloud-native systems: containers, microservices, DevOps, and CI/CD. These elements enable scalability, automation, and continuous deployment in modern cloud computing environments.
Figure 8. Step-by-step illustration of federated ML. This figure explains the federated learning process, highlighting key stages such as local model training, aggregation, and privacy-preserving updates.
26 pages, 11379 KiB  
Article
High-Performance Mobility Simulation: Implementation of a Parallel Distributed Message-Passing Algorithm for MATSim
by Janek Laudan, Paul Heinrich and Kai Nagel
Information 2025, 16(2), 116; https://doi.org/10.3390/info16020116 - 7 Feb 2025
Viewed by 494
Abstract
Striving for better simulation results, transport planners want to simulate larger domains with increased levels of detail. Achieving fast execution times for these complex traffic simulations requires the parallel computing power of modern hardware. This paper presents an architectural update to the MATSim traffic simulation framework, introducing a prototype that adapts the existing traffic flow model to a distributed parallel algorithm. The prototype is capable of scaling across multiple compute nodes. Benchmarking reveals a 119-fold improvement in execution speed over the current implementation and a 43-fold speedup compared to single-core performance. The prototype can simulate 24 h of large-scale traffic in just 3.5 s. Based on these results, we advocate for integrating a distributed simulation approach into MATSim and outline steps for further optimizing the prototype for large-scale applications. Full article
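The headline figures in this abstract can be related through simple arithmetic. The sketch below assumes that the real-time ratio (RTR) mentioned in the figure captions is simulated time divided by wall-clock time; that is a common definition, but the abstract does not spell it out. The function names `real_time_ratio` and `speedup` are hypothetical and are not taken from MATSim.

```python
def real_time_ratio(simulated_seconds: float, wall_clock_seconds: float) -> float:
    """Assumed definition: simulated time divided by wall-clock time."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    return simulated_seconds / wall_clock_seconds


def speedup(reference_seconds: float, new_seconds: float) -> float:
    """Ratio of a reference run time to the run time being evaluated."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return reference_seconds / new_seconds


if __name__ == "__main__":
    # 24 h of simulated traffic in 3.5 s of wall-clock time, as quoted in the abstract.
    rtr = real_time_ratio(24 * 3600, 3.5)
    print(f"RTR is about {rtr:,.0f}")
```

Under this assumed definition the ratio does not depend on how much time is simulated, which matches the reason the captions give for reporting RTR instead of absolute execution time.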
Figure 1. Intersection in the simulated network. Vehicles queue at the end of a link and can cross an intersection once the simulation time has advanced to their earliest exit times. Additionally, the releasing link has to have sufficient flow capacity, and the receiving link sufficient storage capacity. Links crossing a computational domain boundary are divided into two parts: the downstream part manages the queue and the storage and flow capacities; the upstream part mirrors the storage capacity. Vehicles and capacity updates are exchanged as messages between partitions.
Figure 2. Example simulation network divided into four domains. Each network partition is managed by one computing process, which receives an individual numerical rank by which it can be identified. Links crossing a domain boundary are managed by the downstream process and establish a neighbor relationship between the processes they connect. Processes only communicate with neighbor processes.
Figure 3. Schema of the timings to coordinate two simulation processes for one time step. Process 1 requires more time to calculate its share of the simulation (blue arrow) than process 2. Process 2 sends its message and waits until process 1 finishes work and sends its message as well (yellow dotted arrow). Messages are transmitted over the communication hardware after both processes have called send (straight yellow arrow).
Figure 4. The central section of the network used in the example runs. Links are colored by partition.
Figure 5. (a) Real-Time Ratio of different benchmark runs. Real-Time Ratio is used because this measure is independent of the amount of simulated time, in contrast to the absolute execution time. (b) Speedups for the same benchmark runs. The highest speedup value is achieved for the largest scenario.
Figure 6. Timings for performing one time step in the simulation, distinguished by simulation work (blue shades) and message exchange (yellow shades). The timings are shown for setups with different numbers of processes. (a) Relative duration to perform one simulation time step for the 10% setup. (b) Absolute durations to perform one simulation time step for the 0% setup.
Figure 7. Average timings measured for 30 simulation steps. Waiting times are shown in blue, maximum communication times in yellow and times to exchange messages in red. Timings are shown for different setups with 16, 64 and 1024 processes.
Figure 8. Fit of a performance model for the Prototype 10% scenario in red. Predictions of possible RTR using the fitted model: (1) Large Scenario in blue, (2) Fast Communication Hardware in yellow, (3) reduced maximum number of neighbors in green.
Figure 9. (a) RTR of different simulation implementations. RTR is used because this measure is independent of the amount of simulated time, in contrast to the absolute execution time. (b) Speedups for the same simulation implementations.
17 pages, 12434 KiB  
Article
Computational Fluid Dynamics-Based Simulation of Ventilation in a Zigzag Plastic Greenhouse
by Yu Zhang, Weizhen Sun, Longpeng Jin, Hongbing Yang, Jian Wang and Sheng Shu
Horticulturae 2025, 11(2), 175; https://doi.org/10.3390/horticulturae11020175 - 6 Feb 2025
Viewed by 420
Abstract
Zigzag plastic greenhouses are a type of greenhouse with a high natural ventilation capacity, and the number of their roof vents affects their ventilation and cooling effect. In this study, a computational fluid dynamics (CFD) model was constructed to simulate the temperature and airflow distribution of a zigzag plastic greenhouse and to investigate the effects that the number of zigzags and the construction orientation have on the cooling effect of this type of greenhouse. The results show that the average air temperature in a double zigzag plastic greenhouse (DZPG) was 0.58 °C lower than that in a single zigzag plastic greenhouse (SZPG) of the same size during the experiment. When the outdoor temperature is higher than 35 °C, the maximum temperature of the DZPG is significantly lower than that of the SZPG in a 1.5 m horizontal section. When the top vent is on the windward side, the DZPG has an obvious ventilation advantage and uses its top vent more efficiently; when the top vent is on the leeward side, the airflow in the DZPG is more intensive and more uniform. The maximum difference in the average temperature between the eight orientations of the DZPG was 0.17 °C. Therefore, the cooling effect in summer is not influenced by the construction orientation, but the airflow in the greenhouse is slightly worse when the direction of the roof vents is parallel to the prevailing wind direction. Full article
(This article belongs to the Special Issue Cultivation and Production of Greenhouse Horticulture)
Figure 1. Outdoor view of a double zigzag plastic greenhouse.
Figure 2. Sensor distribution map.
Figure 3. Diagram of the greenhouse part of the four types of meshes.
Figure 4. Element quality and corresponding percentage.
Figure 5. Element independence test. (Effect of mesh quality and quantity on simulation results.)
Figure 6. Regression curve between measured and simulated temperature at the measurement positions.
Figure 7. Wind rose map for July to August in Nanjing.
Figure 8. DZPG temperature and air velocity distribution characteristics at 10,800 s. (a) Temperature distribution of three cross-sections along the span of the DZPG; (b) temperature distribution of three cross-sections along the length of the DZPG; (c) air velocity distribution of three cross-sections along the span of the DZPG; (d) air velocity distribution of three sections along the length of the DZPG.
Figure 9. Schematic diagram of DZPG (a) and SZPG (b).
Figure 10. Comparison of simulated temperatures for DZPG and SZPG.
Figure 11. Temperature distribution in the 1.5 m cultivation layer with the DZPG and SZPG vents fully open.
Figure 12. Air velocity in transient analysis at 23,400 s and 32,400 s.
18 pages, 966 KiB  
Article
Mean Field Initialization of the Annealed Importance Sampling Algorithm for an Efficient Evaluation of the Partition Function Using Restricted Boltzmann Machines
by Arnau Prat Pou, Enrique Romero, Jordi Martí and Ferran Mazzanti
Entropy 2025, 27(2), 171; https://doi.org/10.3390/e27020171 - 6 Feb 2025
Viewed by 543
Abstract
Probabilistic models in physics often require the evaluation of normalized Boltzmann factors, which in turn implies the computation of the partition function Z. Obtaining the exact value of Z, though, becomes a forbiddingly expensive task as the system size increases. A possible way to tackle this problem is to use the Annealed Importance Sampling (AIS) algorithm, which provides a tool to stochastically estimate the partition function of the system. The nature of AIS allows for an efficient and parallel implementation in Restricted Boltzmann Machines (RBMs). In this work, we evaluate the partition function of magnetic spin and spin-like systems mapped into RBMs using AIS. So far, the standard application of the AIS algorithm starts from the uniform probability distribution and uses a large number of Monte Carlo steps to obtain reliable estimations of Z following an annealing process. We show that both the quality of the estimation and the cost of the computation can be significantly improved by using a properly selected mean-field starting probability distribution. We perform a systematic analysis of AIS in both small- and large-sized problems, and compare the results to exact values in problems where these are known. As a result, we propose two successful strategies that work well in all the problems analyzed. We conclude that these are good starting points to estimate the partition function with AIS with a relatively low computational cost. The procedures presented are not linked to any learning process, and therefore do not require a priori knowledge of a training dataset. Full article
(This article belongs to the Section Statistical Physics)
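To make the baseline concrete, the following is a minimal pure-Python sketch of the standard AIS estimate of log Z for a tiny binary RBM, starting from the uniform distribution (the conventional starting point that the abstract contrasts with its mean-field initialization). It does not reproduce the authors' mean-field strategies. The energy convention, the linear schedule of inverse temperatures, and all names (`ais_log_z`, `exact_log_z`, `log_p_star`, `gibbs_step`) are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_p_star(v, beta, a, b, W):
    """Unnormalised log-probability of visible vector v at inverse temperature
    beta, with the binary hidden units summed out analytically."""
    total = beta * sum(ai * vi for ai, vi in zip(a, v))
    for j in range(len(b)):
        act = beta * (b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        total += math.log1p(math.exp(act))
    return total


def gibbs_step(v, beta, a, b, W, rng):
    """One block-Gibbs sweep (hidden given visible, then visible given hidden)."""
    nv, nh = len(a), len(b)
    h = [1 if rng.random() < sigmoid(beta * (b[j] + sum(v[i] * W[i][j] for i in range(nv)))) else 0
         for j in range(nh)]
    return [1 if rng.random() < sigmoid(beta * (a[i] + sum(W[i][j] * h[j] for j in range(nh)))) else 0
            for i in range(nv)]


def ais_log_z(a, b, W, n_beta=200, n_chains=200, seed=0):
    """AIS estimate of log Z, annealing from the uniform distribution (beta = 0)
    to the target RBM (beta = 1) on a linear schedule."""
    rng = random.Random(seed)
    nv, nh = len(a), len(b)
    betas = [k / n_beta for k in range(n_beta + 1)]
    log_weights = []
    for _ in range(n_chains):
        v = [rng.randint(0, 1) for _ in range(nv)]  # exact sample at beta = 0
        log_w = 0.0
        for prev, cur in zip(betas, betas[1:]):
            log_w += log_p_star(v, cur, a, b, W) - log_p_star(v, prev, a, b, W)
            v = gibbs_step(v, cur, a, b, W, rng)
        log_weights.append(log_w)
    # Log of the mean importance weight, computed stably.
    m = max(log_weights)
    log_mean = m + math.log(sum(math.exp(w - m) for w in log_weights) / n_chains)
    # At beta = 0 every one of the 2**(nv + nh) states has unit weight.
    return (nv + nh) * math.log(2) + log_mean


def exact_log_z(a, b, W):
    """Brute-force log Z by enumerating all visible states (small models only)."""
    nv = len(a)
    terms = [log_p_star([(s >> i) & 1 for i in range(nv)], 1.0, a, b, W)
             for s in range(2 ** nv)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

For a model small enough to enumerate, `ais_log_z` can be compared against `exact_log_z`, mirroring the abstract's comparison with exact values where these are known. Each chain is independent, which is what makes the method easy to parallelize.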
Figure 1. Examples of checkerboard configurations representing 1D (a) and 2D (b) magnetic spin systems. Black and white circles correspond to visible and hidden units, when mapped into RBMs.
Figure 2. AIS estimation of $\log(Z)$ starting from $\mathbf{B} = 0$ for the MNIST-20h (left) and ten different GWGM sets of weights (right) as a function of the number $N_\beta$ of intermediate distributions. The left panel shows both the exact value (in blue) and the AIS estimations, while on the right, the ratio of these two quantities is plotted.
Figure 3. Percentage of AIS samples producing an estimation of $\log(Z)$ with a relative error of less than $5\%$ with respect to the exact result, obtained starting from $\mathbf{B} = 0$. The results have been averaged over all sets of weights corresponding to the same problem.
Figure 4. Percentage of AIS samples producing a relative error lower than or equal to $5\%$ with respect to the exact $\log(Z)$ value, as a function of the number of hidden units and inverse temperature for a representative GWGM model. The left and right panels show the results starting from $\mathbf{B} = 0$ and $\mathbf{B} = \mathbf{B}^{*}$, respectively.
Figure 5. Relative error of all models in the transposed and non-transposed GWGM weights, computed as in Equation (23). For the sake of clarity, the models have been sorted according to the relative error of the non-transposed results.
Figure 6. Percentage of AIS samples with a relative error lower than $0.05\%$ with respect to the exact $\log(Z)$ for the different problems analyzed. The left, middle and right bars with different gray levels correspond to the predictions starting from $\mathbf{B} = 0$, $\mathbf{B} = \mathbf{B}_{\mathrm{Pinv}}$ and $\mathbf{B} = \mathbf{B}_{\mathrm{Signs\_h}}$, respectively.
Figure 7. Percentage of GWGM AIS samples with a relative error lower than or equal to $\epsilon_r$ with respect to the exact $\log(Z)$.
Figure 8. Comparison of the AIS estimation of $\log(Z)$ along learning for the MNIST dataset with 500 hidden units obtained starting from the different mean field probability distributions discussed in this work. The first points correspond to the first epochs, while the last ones show the predictions obtained at an intermediate stage.
20 pages, 3107 KiB  
Article
Computer Simulation and Speedup of Solving Heat Transfer Problems of Heating and Melting Metal Particles with Laser Radiation
by Arturas Gulevskis and Konstantin Volkov
Computers 2025, 14(2), 47; https://doi.org/10.3390/computers14020047 - 4 Feb 2025
Viewed by 428
Abstract
Studying the action of laser radiation on powder materials requires mathematical models of the interaction between laser radiation and powder particles that take into account the features of energy supply and are applicable over a wide range of beam parameters and particle material properties. A model of the interaction of pulsed or pulse-periodic laser radiation with a spherical metal particle is developed. To find the temperature distribution in the particle volume, the non-stationary three-dimensional heat conduction equation is solved with a source term that accounts for the action of laser radiation. In the plane normal to the direction of propagation of the laser radiation, the radiation intensity follows a Gaussian law. Changes in the intensity of laser radiation in space due to its absorption by the environment can also be taken into account. To accelerate numerical calculations, the computational algorithm uses vectorized data structures and a parallel implementation of operations on general-purpose graphics accelerators (GPUs). The features of the software implementation are presented for the iterative method used to solve the system of difference equations that arises from the finite-volume discretization of the heat conduction equation with an implicit scheme. The model developed describes the heating and melting of a spherical metal particle exposed to multi-pulsed laser radiation. The model and calculation results are of interest for constructing a two-phase flow model describing the interaction of test particles with laser radiation on the scale of the entire calculation domain. Such a model is implemented using a discrete-trajectory approach to modeling the motion and heat exchange of a dispersed admixture. Full article
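As a hedged illustration of the kind of model this abstract describes, a generic transient heat equation with a laser source term, together with a Gaussian radial intensity profile attenuated along the beam, can be written as below. The symbols $\rho$ (density), $c$ (specific heat), $\lambda$ (thermal conductivity), $Q$ (volumetric source), $I_0$ (peak intensity), $r_0$ (beam radius) and $\alpha$ (absorption coefficient of the surrounding medium) are assumptions introduced here; the exact form and normalization used in the paper are not given in the abstract.

```latex
\rho c \,\frac{\partial T}{\partial t}
  = \nabla \cdot \left( \lambda \nabla T \right) + Q(\mathbf{x}, t),
\qquad
I(r, z) = I_0 \exp\!\left( -\frac{r^{2}}{r_0^{2}} \right) \exp\!\left( -\alpha z \right)
```

Here $r$ is the distance from the beam axis in the plane normal to the propagation direction and $z$ is the distance travelled through the absorbing medium; setting $\alpha = 0$ recovers a purely Gaussian profile.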
Figure 1. Variation of the intensity of the laser pulse along the radius.
Figure 2. Outward normals to the faces of the control volumes.
Figure 3. Dividing the surface of a sphere into elements.
Figure 4. Types of control volumes: polar control volumes (a), internal control volumes (b), and control volume at the center of the sphere (c).
Figure 5. Example of addressing subsets of central cells (a), extended matrix of indices (b), and continuous numbering of cells (c).
Figure 6. Example of addressing subsets of control volumes $\mathbf{C}$ (a), $\mathbf{E}$ (b), $\mathbf{W}$ (c), $\mathbf{N}$ (d), and $\mathbf{S}$ (e).
Figure 7. Example of control volume layer numbering.
Figure 8. Transient temperature distributions at sphere centre for various ratios of thermal conductivities of the sphere and medium: $\lambda_1 = 0.5\lambda_2$ (1), $\lambda_1 = \lambda_2$ (2), and $\lambda_1 = 2\lambda_2$ (3).
Figure 9. Temperature change over time at points $r_i = R$ (a), $r_i = R/2$ (b), and $r_i = 0$ (c).
Figure 10. Temperature distributions along the azimuthal axis of the particle at times $t = 0.2$ ms (a), 30 ms (b), and 140 ms (c).
Figure 11. Thermal state of the particle over time in the azimuthal (a,c,e) and polar (b,d,f) planes at times $t = 0.2$ ms (a,b), 30 ms (c,d), and 140 ms (e,f).
Figure 12. Dependence of the calculation time on the grid sizes.
19 pages, 3047 KiB  
Article
Development and Validation of a Rapid Tool to Measure Pragmatic Abilities: The Brief Assessment of Pragmatic Abilities and Cognitive Substrates (APACS Brief)
by Luca Bischetti, Federico Frau, Veronica Pucci, Giulia Agostoni, Chiara Pompei, Veronica Mangiaterra, Chiara Barattieri di San Pietro, Biagio Scalingi, Francesca Dall’Igna, Ninni Mangiaracina, Sara Lago, Sonia Montemurro, Sara Mondini, Marta Bosia, Giorgio Arcara and Valentina Bambini
Behav. Sci. 2025, 15(2), 107; https://doi.org/10.3390/bs15020107 - 21 Jan 2025
Viewed by 959
Abstract
Pragmatics is key to communicating effectively, and its assessment in vulnerable populations is of paramount importance. Although tools exist for this purpose, they are often effortful and time-consuming, with complex scoring procedures, which hampers their inclusion in clinical practice. To address these issues, we present the Brief Assessment of Pragmatic Abilities and Cognitive Substrates (APACS Brief), a rapid (10 min), easy-to-use and freely distributed tool for evaluating pragmatics in Italian, inspired by the existing APACS test and already validated in the remote version (APACS Brief Remote). The APACS Brief test measures, with a simplified scale, the domains of discourse production and figurative language understanding and is developed in two parallel forms, each including novel items differing from APACS. Psychometric properties, cut-off scores, and thresholds for change were computed on 287 adults. The analysis revealed satisfactory internal consistency, good test–retest reliability, and strong concurrent and construct validity. Moreover, APACS Brief showed excellent discriminant validity on a sample of 56 patients with schizophrenia, who were also cross-classified consistently by APACS Brief and APACS cut-off values. Overall, APACS Brief is a reliable tool for evaluating pragmatic skills and their breakdown, with brief administration time and simple scoring making it well-suited for screening in at-risk populations. Full article
(This article belongs to the Section Cognition)
Figure 1. Study design and structure of the APACS Brief test. (A) Study design with final samples in each arm. T0 lasted approximately 25 min in the internal consistency arm, 45 min in the reliability, Alternate Form, and concurrent and discriminant validity arms, and 15 min in the in presence-remote arm. T1 lasted approximately 15 min in the reliability, Alternate Form, discriminant validity, and in presence-remote arms, and 40 min in the concurrent validity arm. The assessment included measures of vocabulary (from the Wechsler Adult Intelligence Scale–Revised, WAIS-R; Orsini & Laicardi, 1997), general cognitive (Global Examination of Mental State, GEMS; Mondini et al., 2022), intellectual abilities (Test di Intelligenza Breve, TIB; Colombo et al., 2002), and cognitive reserve (Cognitive Reserve Index questionnaire, CRIq; Nucci et al., 2012). (B) Structure of APACS Brief with examples. See the online repository, file 2 (https://osf.io/5xevt/, accessed on 1 January 2025) for the psycholinguistic properties of the items of the APACS Brief Alternate Form.
Figure 2. Visual representation of the regression coefficients for the role of demographic variables in APACS Brief and its Alternate Form total scores. The light green line corresponds to the linear term and the dark green line to the second-order polynomial term introduced in the regression analysis, plotted with their color-matching 95% confidence intervals. A position adjustment (jitter) for the observations was added for visualization purposes.
Figure 3. Discriminant validity and APACS-APACS Brief cross-classification analysis. (A,B) Comparison of mean scores obtained in the APACS Brief and APACS tests between individuals with schizophrenia (in light blue) and neurotypical individuals (in green, n = 73, coming from the concurrent validity arm), with independent samples t-test p-values; (C) ROC analysis discriminating between patients and controls, with AUC value; (D) Cross-classification of performance above and below normative cut-off in the APACS and APACS Brief tests in the sample of patients with schizophrenia.
33 pages, 19016 KiB  
Article
Multitask Learning-Based Pipeline-Parallel Computation Offloading Architecture for Deep Face Analysis
by Faris S. Alghareb and Balqees Talal Hasan
Computers 2025, 14(1), 29; https://doi.org/10.3390/computers14010029 - 20 Jan 2025
Viewed by 1239
Abstract
Deep Neural Networks (DNNs) have been widely adopted in advanced artificial intelligence applications because their accuracy is competitive with that of the human brain. Nevertheless, this accuracy comes at the expense of intensive computation and storage complexity, requiring custom expandable hardware, i.e., graphics processing units (GPUs). Interestingly, leveraging the synergy of parallelism and edge computing can significantly improve CPU-based hardware platforms. Therefore, this manuscript explores levels of parallelism together with edge computation offloading to develop a hardware platform that improves the efficacy of deep learning computing architectures. Furthermore, the multitask learning (MTL) approach is employed to construct a parallel multi-task classification network. These tasks include face detection and recognition, age estimation, gender recognition, smile detection, and hair color and style classification. Additionally, both pipeline and parallel processing techniques are used to expedite complicated computations, boosting the overall performance of the presented deep face analysis architecture. A computation offloading approach is used to assign computation-intensive tasks to the server edge, whereas lightweight computations are offloaded to edge devices, i.e., Raspberry Pi 4. To train the proposed deep face analysis network architecture, two custom datasets (HDDB and FRAED) were created for head detection and face-age recognition. Extensive experimental results demonstrate the efficacy of the proposed pipeline-parallel architecture in terms of execution time. It requires 8.2 s to provide detailed face detection and analysis for an individual and 23.59 s for an inference containing 10 individuals. Moreover, a speedup of 62.48% is achieved compared to the sequential-based edge computing architecture.
Meanwhile, a 25.96% speedup is achieved when the proposed pipeline-parallel architecture is implemented only on the server edge, compared to the sequential server implementation. Considering classification efficiency, the proposed classification modules achieve an accuracy of 88.55% for hair color and style classification and a remarkable prediction outcome of 100% for face recognition and age estimation. To summarize, the proposed approach can help reduce the required execution time and memory capacity by processing all facial tasks simultaneously on a single deep neural network rather than building a CNN model for each task. Therefore, the presented pipeline-parallel architecture can be a cost-effective framework for real-time computer vision applications implemented on resource-limited devices.
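The pipeline idea described in this abstract can be illustrated with a minimal sketch, assuming a two-stage layout in which one thread detects heads while another classifies the heads of the previous image. This is not the authors' implementation: `detect_heads`, `classify_heads`, and `run_pipeline` are hypothetical stand-ins, and the real system uses YOLOv8 and an MTL classifier on separate devices.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the input stream


def stage(worker, inbox, outbox):
    """Run one pipeline stage: apply `worker` to each item until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # pass the shutdown signal downstream
            return
        outbox.put(worker(item))


def detect_heads(image):
    # Hypothetical stand-in for a detector: pretend every image holds two heads.
    return {"image": image, "heads": [f"{image}-head0", f"{image}-head1"]}


def classify_heads(detection):
    # Hypothetical stand-in for a multitask classifier: one label dict per head.
    labels = [{"head": h, "smile": False} for h in detection["heads"]]
    return {"image": detection["image"], "labels": labels}


def run_pipeline(images):
    """Feed images through detection and classification stages that run concurrently."""
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=stage, args=(detect_heads, q_in, q_mid)),
        threading.Thread(target=stage, args=(classify_heads, q_mid, q_out)),
    ]
    for t in threads:
        t.start()
    for image in images:
        q_in.put(image)
    q_in.put(SENTINEL)

    results = []
    while True:
        item = q_out.get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Because each stage here is a single thread reading from a FIFO queue, results come out in input order; while the classifier handles image k, the detector can already work on image k+1, which is the overlap a pipeline relies on.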
Figure 1
<p>Head detection dataset versus face detection dataset using nano-based YOLOv8.</p>
Figure 2
<p>Sample images from the hair dataset used to train the hair color-style module.</p>
Figure 3
<p>Selected image samples of the created face recognition and age estimation dataset.</p>
Figure 4
<p>The general framework of the proposed deep face analysis architecture.</p>
Figure 5
<p>Stages of the pipeline-multithreading architecture, showing four images being processed in parallel.</p>
Figure 6
<p>Proposed pipeline-parallel architectures with thread distributions; (<b>a</b>) multithreading three MTL-based classifiers on a single edge device, (<b>b</b>) multithreading three MTL-based classifiers on a cluster containing three edge computing devices.</p>
Figure 7
<p>Modified VGG-Face network to support the multitask classification approach.</p>
Figure 8
<p>Offloading feature maps of detected heads to edge devices using multithreading.</p>
Figure 9
<p>Multithreading of parallel modules on edge server and edge node processors.</p>
Figure 10
<p>The framework of system deployment for the proposed deep face analysis.</p>
Figure 11
<p>Training and validation performance of the YOLOv8 model for head detection. The x-axis represents the number of epochs.</p>
Figure 12
<p>YOLOv8 testing performance; (<b>a</b>) confusion matrix, (<b>b</b>) precision, (<b>c</b>) recall, (<b>d</b>) precision-recall, and (<b>e</b>) F1 score confidence curve.</p>
Figure 13
<p>Head detection result samples of YOLOv8, where a red box denotes a detected head with its corresponding confidence level.</p>
Figure 14
<p>Confusion matrices for classification modules using STL and MTL; (<b>a</b>) hair color STL, (<b>b</b>) hair color MTL, (<b>c</b>) hairstyle STL, (<b>d</b>) hairstyle MTL, (<b>e</b>) gender STL, (<b>f</b>) gender MTL, (<b>g</b>) smile STL, (<b>h</b>) smile MTL, (<b>i</b>) Face STL, (<b>j</b>) Face MTL, (<b>k</b>) age STL, and (<b>l</b>) age MTL module.</p>
Figure 15
<p>Speed performance evaluation of the proposed pipeline-parallel architecture; (<b>a</b>) execution time for pipeline-parallel configurations versus sequential implementation, (<b>b</b>) speedup comparisons of implemented configurations.</p>
20 pages, 7483 KiB  
Article
An Enhanced LiDAR-Based SLAM Framework: Improving NDT Odometry with Efficient Feature Extraction and Loop Closure Detection
by Yan Ren, Zhendong Shen, Wanquan Liu and Xinyu Chen
Processes 2025, 13(1), 272; https://doi.org/10.3390/pr13010272 - 19 Jan 2025
Viewed by 979
Abstract
Simultaneous localization and mapping (SLAM) is crucial for autonomous driving, drone navigation, and robot localization, relying on efficient point cloud registration and loop closure detection. Traditional Normal Distributions Transform (NDT) odometry frameworks provide robust solutions but struggle with real-time performance due to the high computational complexity of processing large-scale point clouds. This paper introduces an improved NDT-based LiDAR odometry framework to address these challenges. The proposed method enhances computational efficiency and registration accuracy by introducing a unified feature point cloud framework that integrates planar and edge features, enabling more accurate and efficient inter-frame matching. To further improve loop closure detection, a parallel hybrid approach combining Radius Search and Scan Context is developed, which significantly enhances robustness and accuracy. Additionally, feature-based point cloud registration is integrated with full cloud mapping in global optimization, ensuring high-precision pose estimation and detailed environmental reconstruction. Experiments on both public datasets and real-world environments validate the effectiveness of the proposed framework. Compared with traditional NDT, our method improves trajectory estimation accuracy by 35.59% with loop detection and by over 35% without it. The average registration time is reduced by 66.7%, memory usage is decreased by 23.16%, and CPU usage drops by 19.25%. These results surpass those of existing SLAM systems, such as LOAM. The proposed method demonstrates superior robustness, enabling reliable pose estimation and map construction in dynamic, complex settings.
(This article belongs to the Section Manufacturing Processes and Systems)
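The abstract's "parallel hybrid" loop closure detection can be pictured with a small sketch, assuming that a position-based radius search and a descriptor-based search run concurrently and their candidate sets are then compared. Everything here is hypothetical: the function names, the 2D poses, the cosine-distance comparison (a simplified stand-in, not the actual Scan Context descriptor), and the intersection rule used to combine the two results are illustrative assumptions, not the authors' method.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def radius_candidates(poses, current, radius, min_gap):
    """Earlier frames whose 2D position lies within `radius` of frame `current`.

    The most recent `min_gap` frames are skipped so that neighbours in time
    are not reported as loop closures.
    """
    cx, cy = poses[current]
    return {
        i for i in range(max(0, current - min_gap))
        if math.hypot(poses[i][0] - cx, poses[i][1] - cy) <= radius
    }


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def descriptor_candidates(descriptors, current, max_distance, min_gap):
    """Earlier frames whose descriptor is close to the current one.

    Simplified stand-in for a scan descriptor comparison (assumption).
    """
    return {
        i for i in range(max(0, current - min_gap))
        if cosine_distance(descriptors[i], descriptors[current]) <= max_distance
    }


def detect_loop_candidates(poses, descriptors, current,
                           radius=1.0, max_distance=0.1, min_gap=2):
    """Run both searches in parallel threads and report each result set."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        by_radius = pool.submit(radius_candidates, poses, current, radius, min_gap)
        by_descriptor = pool.submit(
            descriptor_candidates, descriptors, current, max_distance, min_gap)
        r, d = by_radius.result(), by_descriptor.result()
    # Combination rule is an assumption: keep frames that both searches agree on.
    return {"radius": r, "descriptor": d, "both": r & d}
```

Requiring agreement between the two searches trades recall for fewer false loop closures; taking the union instead would do the opposite. Which rule the paper uses is not stated in the abstract.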
Figure 1
<p>The system structure.</p>
Figure 2
<p>Combined feature point cloud. (<b>a</b>) is the raw point cloud acquired by LiDAR, and (<b>b</b>) is the feature point cloud. The feature point cloud is composed of planar points, edge points, and ground points; the outlier points and small-scale points in the environment are removed; and only large-scale point clouds are retained. Compared to the original point cloud, the feature point cloud significantly reduces the number of points while effectively preserving environmental features.</p>
Figure 3
<p>(<b>a</b>) KITTI data acquisition platform, equipped with an inertial navigation system (GPS/IMU) OXTS RT 3003, a Velodyne HDL-64E LiDAR, two 1.4 MP grayscale cameras, two 1.4 MP color cameras, and four zoom lenses. (<b>b</b>) Sensor installation positions on the platform.</p>
Figure 4
<p>Comparison of trajectories across different algorithm frameworks for Sequence 00-10. The trajectories generated during mapping for LOAM, LeGO-LOAM, DLO, the original NDT, and our method are compared.</p>
Figure 5
<p>Loop closure detection results for various methods on Sequence 09. It can be seen that our improved method effectively identifies the loop closure. The parallel strategy using two loop closure detection methods greatly improves detection accuracy.</p>
Figure 6
<p>(<b>a</b>–<b>c</b>) Inter-frame registration time, memory usage, and CPU usage before and after the improvement. Our improved method effectively reduces matching time and computational load.</p>
Figure 7
<p>Mobile robot platform.</p>
Figure 8
<p>Maps generated using the improved method. (<b>a</b>–<b>d</b>) The one-way corridor, round-trip corridor, loop corridor, and long, feature-sparse corridor, respectively.</p>
Figure 9
<p>(<b>a</b>–<b>d</b>) Maps generated by the original method. Significant mapping errors occurred in larger environments, such as (<b>c</b>,<b>d</b>).</p>
Figure 10
<p>Detailed comparison between the improved and original methods. (<b>a</b>,<b>b</b>) The improved and original methods, respectively. The improved method balances detail preservation and computation speed, while the original sacrifices some environmental accuracy for mapping results.</p>
Figure 11
<p>Map comparison. (<b>a</b>) The Google Earth image. (<b>b</b>) LeGO-LOAM failed to close the loop due to the lack of IMU data, leading to Z-axis drift. (<b>c</b>) The original NDT framework experienced significant drift in large-scale complex environments. (<b>d</b>) The improved method produced maps closely matching the real environment.</p>
Figure 12
<p>Detail of Scenario 2. The improved method preserved environmental details without artifacts or mismatches.</p>
Figure 13
<p>(<b>a</b>–<b>c</b>) Scenario 2 map comparison. (<b>b</b>) The map generated by the original NDT method lacked details. (<b>c</b>) The improved method effectively preserved details.</p>