Search Results (1,186)

Search Parameters:
Keywords = feature extraction/matching

21 pages, 2997 KiB  
Article
BEV Semantic Map Reconstruction for Self-Driving Cars with the Multi-Head Attention Mechanism
by Yi-Cheng Liao, Jichiang Tsai and Hsuan-Ying Chien
Electronics 2025, 14(1), 32; https://doi.org/10.3390/electronics14010032 (registering DOI) - 25 Dec 2024
Abstract
Environmental perception is crucial for safe autonomous driving, enabling accurate analysis of the vehicle’s surroundings. While 3D LiDAR is traditionally used for 3D environment reconstruction, its high cost and complexity present challenges. In contrast, camera-based cross-view frameworks can offer a cost-effective alternative. Hence, this manuscript proposes a new cross-view model to extract mapping features from camera images and then transfer them to a Bird’s-Eye View (BEV) map. Particularly, a multi-head attention mechanism in the decoder architecture generates the final semantic map. Each camera learns embedding information corresponding to its position and angle within the BEV map. Cross-view attention fuses information from different perspectives to predict top-down map features enriched with spatial information. The multi-head attention mechanism then globally performs dependency matches, enhancing long-range information and capturing latent relationships between features. Transposed convolution replaces traditional upsampling methods, avoiding high similarities of local features and facilitating semantic segmentation inference of the BEV map. Finally, we conduct numerous simulation experiments to verify the performance of our cross-view model. Full article
(This article belongs to the Special Issue Advancement on Smart Vehicles and Smart Travel)
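The abstract describes BEV queries attending over multi-camera features followed by transposed-convolution upsampling. Below is a minimal, self-contained sketch of that kind of cross-view multi-head attention step using PyTorch; the tensor shapes, module sizes, and names (camera_feats, bev_queries) are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Minimal sketch: BEV grid queries attend over flattened per-camera image tokens
# (cross-view attention), then a transposed convolution upsamples the fused BEV map.
B, N_cam, HW, C = 2, 6, 64 * 16, 256      # batch, cameras, tokens per camera, channels (assumed)
bev_h = bev_w = 32                         # BEV grid size before upsampling

camera_feats = torch.randn(B, N_cam * HW, C)      # image tokens + positional embeddings
bev_queries  = torch.randn(B, bev_h * bev_w, C)   # learnable BEV grid queries

cross_attn = nn.MultiheadAttention(embed_dim=C, num_heads=8, batch_first=True)
bev_tokens, _ = cross_attn(query=bev_queries, key=camera_feats, value=camera_feats)

# Reshape to a BEV feature map and upsample with a transposed convolution
bev_map = bev_tokens.transpose(1, 2).reshape(B, C, bev_h, bev_w)
upsample = nn.ConvTranspose2d(C, 64, kernel_size=2, stride=2)
seg_head = nn.Conv2d(64, 10, kernel_size=1)        # e.g., 10 semantic classes (assumed)
logits = seg_head(upsample(bev_map))               # (B, 10, 64, 64)
print(logits.shape)
```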
18 pages, 68585 KiB  
Article
A Registration Method Based on Ordered Point Clouds for Key Components of Trains
by Kai Yang, Xiaopeng Deng, Zijian Bai, Yingying Wan, Liming Xie and Ni Zeng
Sensors 2024, 24(24), 8146; https://doi.org/10.3390/s24248146 - 20 Dec 2024
Viewed by 253
Abstract
Point cloud registration is pivotal across various applications, yet traditional methods rely on unordered point clouds, leading to significant challenges in terms of computational complexity and feature richness. These methods often use k-nearest neighbors (KNN) or neighborhood ball queries to access local neighborhood information, which is not only computationally intensive but also confines the analysis within the object’s boundary, making it difficult to determine if points are precisely on the boundary using local features alone. This indicates a lack of sufficient local feature richness. In this paper, we propose a novel registration strategy utilizing ordered point clouds, which are now obtainable through advanced depth cameras, 3D sensors, and structured light-based 3D reconstruction. Our approach eliminates the need for computationally expensive KNN queries by leveraging the inherent ordering of points, significantly reducing processing time; extracts local features by utilizing 2D coordinates, providing richer features compared to traditional methods, which are constrained by object boundaries; compares feature similarity between two point clouds without keypoint extraction, enhancing efficiency and accuracy; and integrates image feature-matching techniques, leveraging the coordinate correspondence between 2D images and 3D-ordered point clouds. Experiments on both synthetic and real-world datasets, including indoor and industrial environments, demonstrate that our algorithm achieves an optimal balance between registration accuracy and efficiency, with registration times consistently under one second. Full article
(This article belongs to the Section Sensing and Imaging)
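The key idea in the abstract is that an ordered (organized) point cloud lets a local neighborhood be read off by 2D index slicing instead of a KNN search. A minimal NumPy sketch of that lookup is below; the array sizes and feature computation are placeholders, not the paper's method.

```python
import numpy as np

# An organized point cloud stores XYZ per pixel of the sensor grid, so a local
# neighborhood is just a 2D window around (row, col): no KNN query is needed.
H, W = 480, 640
cloud = np.random.rand(H, W, 3).astype(np.float32)   # placeholder organized cloud

def window_neighbors(cloud, r, c, k=3):
    """Return the (2k+1)x(2k+1) neighborhood of point (r, c) via index slicing."""
    r0, r1 = max(r - k, 0), min(r + k + 1, cloud.shape[0])
    c0, c1 = max(c - k, 0), min(c + k + 1, cloud.shape[1])
    return cloud[r0:r1, c0:c1].reshape(-1, 3)

neigh = window_neighbors(cloud, 240, 320)

# Simple local features from the window, e.g., distances to the central point
center = cloud[240, 320]
d = np.linalg.norm(neigh - center, axis=1)
print(neigh.shape, d.mean())
```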
Figures 1-10: comparison of the KNN query method with the ordered-point approach; distance and angular relationships between neighboring points and the central point; fusion of semantic (SuperGlue/ResNet) and geometric (MLP) features; registration comparisons against ICP and Harris + FPFH; registration of point clouds from a 3D camera, of scenes with non-rigid objects, and of scenes with extensive planar areas and increased noise; extracted matching points with and without plane removal; SIFT versus SIFT + Algorithm 1 matching and registration; and registration results for semantic-geometric features.
19 pages, 2207 KiB  
Article
Exploring Voice Acoustic Features Associated with Cognitive Status in Korean Speakers: A Preliminary Machine Learning Study
by Jiho Lee, Nayeon Kim, Ji-Wan Ha, Kyunghun Kang, Eunhee Park, Janghyeok Yoon and Ki-Su Park
Diagnostics 2024, 14(24), 2837; https://doi.org/10.3390/diagnostics14242837 - 17 Dec 2024
Viewed by 302
Abstract
Objective: To develop a non-invasive cognitive impairment detection system using speech data analysis, addressing the growing global dementia crisis and enabling accessible early screening through daily health monitoring. Methods: Speech data from 223 Korean patients were collected across eight tasks. Patients were classified based on Korean Mini-Mental State Examination scores. Four machine learning models were tested for three binary classification tasks. Voice acoustic features were extracted and analyzed. Results: The Deep Neural Network model performed best in two classification tasks, with Precision-Recall Area Under the Curve scores of 0.737 for severe vs. no impairment and 0.726 for mild vs. no impairment, while Random Forest achieved 0.715 for severe + mild vs. no impairment. Several acoustic features emerged as potentially important indicators, with DDA shimmer from the /i/ task and stdevF0 from the /puh-tuh-kuh/ task showing consistent patterns across classification tasks. Conclusions: This preliminary study suggests that certain acoustic features may be associated with cognitive status, though demographic factors significantly influence these relationships. Further research with demographically matched populations is needed to validate these findings. Full article
(This article belongs to the Special Issue A New Era in Diagnosis: From Biomarkers to Artificial Intelligence)
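The study reports Precision-Recall AUC under cross-validation for binary cognitive-status classifiers. A minimal sketch of that evaluation loop with scikit-learn is shown below; the synthetic feature matrix, the Random Forest settings, and the class balance are assumptions, not the study's data or models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import StratifiedKFold

# Placeholder acoustic-feature matrix and binary labels (e.g., impaired vs. normal).
X, y = make_classification(n_samples=223, n_features=30, weights=[0.6, 0.4], random_state=0)

scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    proba = clf.predict_proba(X[test_idx])[:, 1]
    # Precision-Recall AUC (average precision) on the held-out fold
    scores.append(average_precision_score(y[test_idx], proba))

print(f"PR-AUC: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```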
Figures 1-6: overview of the proposed approach; environmental setup for speech data collection; summary of the five-fold cross-validation process; and SHAP values for the severe vs. normal, mild vs. normal, and severe + mild vs. normal classification tasks.
29 pages, 96249 KiB  
Article
SAR-MINF: A Novel SAR Image Descriptor and Matching Method for Large-Scale Multidegree Overlapping Tie Point Automatic Extraction
by Shuo Li, Xiongwen Yang, Xiaolei Lv and Jian Li
Remote Sens. 2024, 16(24), 4696; https://doi.org/10.3390/rs16244696 - 16 Dec 2024
Viewed by 374
Abstract
The automatic extraction of large-scale tie points (TPs) for Synthetic Aperture Radar (SAR) images is crucial for generating SAR Digital Orthophoto Maps (DOMs). This task involves matching SAR images under various conditions, such as different resolutions, incidence angles, and orbital directions, which is highly challenging. To address the feature extraction challenges of different SAR images, we propose a Gamma Modulated Phase Congruency (GMPC) model. This improved phase congruency model is defined by a Gamma Modulation Filter (GMF) and an adaptive noise model. Additionally, to reduce layover interference in SAR images, we introduce a GMPC-Harris feature point extraction method with layover perception. We also propose a matching method based on the SAR Modality Independent Neighborhood Fusion (SAR-MINF) descriptor, which fuses feature information from different neighborhoods. Furthermore, we present a graph-based overlap extraction algorithm and establish an automated workflow for large-scale TP extraction. Experiments show that the proposed SAR-MINF matching method increases the Correct Match Rate (CMR) by an average of 31.2% and the matching accuracy by an average of 57.8% compared with other prevalent SAR image matching algorithms. The proposed TP extraction algorithm can extract full-degree TPs with an accuracy of less than 0.5 pixels for more than 98% of 2-degree TPs and over 95% of multidegree TPs, meeting the requirements of DOM production. Full article
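The workflow includes a graph-based overlap extraction step before multidegree tie-point matching. As a drastically simplified stand-in, the sketch below groups images that share pairwise overlaps into connected components with plain Python; the image names and overlap pairs are hypothetical, and the paper's actual algorithm is not reproduced here.

```python
# Images are nodes, an edge means two SAR images overlap; traversing the graph
# yields candidate overlap groups in which multidegree tie points can be matched.
from collections import defaultdict

overlap_pairs = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "E")]  # assumed 2-degree overlaps

graph = defaultdict(set)
for u, v in overlap_pairs:
    graph[u].add(v)
    graph[v].add(u)

def connected_components(graph):
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Each component is a candidate overlap group, e.g. {A, B, C} -> 3-degree tie points
print(connected_components(graph))
```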
Figures 1-24: Fourier series expansions of square and triangular waves; the two-dimensional Gamma Modulation Filter (Gamma kernel, odd part GMF_o, even part GMF_e); simulated SAR images with multiplicative noise and the definitions of o_sar, e_sar, and the SAR local energy E_sar at θ = 0; comparison of GMPC, PC, and SAR-PC on real SAR images; the SAR-MIND matching process; comparison of keypoint extraction algorithms (GMPC-Harris, SAR-Harris, Harris, SURF, FAST); layover areas, their geometric relationships, and GMPC orientation histograms H_GMPC for layover versus normal areas; layover-aware GMPC-Harris keypoints in mountain and building areas; sampling spaces of the DAISY, GLOH, and SAR-MINF descriptors; construction of the SAR-MINF descriptor; the flowchart of large-scale tie point automatic extraction; experimental image pairs A-H; CMR and RMSE of the compared algorithms; RIFT results for pair G; matching results and checkboard mosaics for pair F and for all pairs; CMR and RMSE over varying search radii; the neighborhood-fusion ablation; geographical distribution of the two TP test datasets; the multidegree overlapping graph of the first set of images; RMSE histograms for 2-degree and multidegree TPs; and slices of 3-, 4-, and 5-degree overlapping TPs.
22 pages, 7963 KiB  
Article
WTSM-SiameseNet: A Wood-Texture-Similarity-Matching Method Based on Siamese Networks
by Yizhuo Zhang, Guanlei Wu, Shen Shi and Huiling Yu
Information 2024, 15(12), 808; https://doi.org/10.3390/info15120808 - 16 Dec 2024
Viewed by 313
Abstract
In tasks such as wood defect repair and the production of high-end wooden furniture, ensuring the consistency of the texture in repaired or jointed areas is crucial. This paper proposes the WTSM-SiameseNet model for wood-texture-similarity matching and introduces several improvements to address the issues present in traditional methods. First, to address the issue that fixed receptive fields cannot adapt to textures of different sizes, a multi-receptive field fusion feature extraction network was designed. This allows the model to autonomously select the optimal receptive field, enhancing its flexibility and accuracy when handling wood textures at different scales. Secondly, the interdependencies between layers in traditional serial attention mechanisms limit performance. To address this, a concurrent attention mechanism was designed, which reduces interlayer interference by using a dual-stream parallel structure that enhances the ability to capture features. Furthermore, to overcome the issues of existing feature fusion methods that disrupt spatial structure and lack interpretability, this study proposes a feature fusion method based on feature correlation. This approach not only preserves the spatial structure of texture features but also improves the interpretability and stability of the fused features and the model. Finally, by introducing depthwise separable convolutions, the issue of a large number of model parameters is addressed, significantly improving training efficiency while maintaining model performance. Experiments were conducted using a wood texture similarity dataset consisting of 7588 image pairs. The results show that WTSM-SiameseNet achieved an accuracy of 96.67% on the test set, representing a 12.91% improvement in accuracy and a 14.21% improvement in precision compared to the pre-improved SiameseNet. Compared to CS-SiameseNet, accuracy increased by 2.86%, and precision improved by 6.58%. Full article
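The abstract mentions a Siamese structure with depthwise separable convolutions to cut parameters. Below is a minimal PyTorch sketch of a weight-shared Siamese branch built from depthwise separable convolutions and a cosine-similarity score; the layer sizes and input resolution are assumptions, not the WTSM-SiameseNet design.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv followed by a 1x1 pointwise conv (fewer parameters than a full conv)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return torch.relu(self.pointwise(self.depthwise(x)))

class SiameseBranch(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            DepthwiseSeparableConv(3, 32), nn.MaxPool2d(2),
            DepthwiseSeparableConv(32, 64), nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):
        return self.features(x).flatten(1)   # (B, 64) texture embedding

branch = SiameseBranch()                      # the same weights process both inputs
emb_a = branch(torch.randn(4, 3, 128, 128))
emb_b = branch(torch.randn(4, 3, 128, 128))
similarity = torch.cosine_similarity(emb_a, emb_b)   # higher = more similar textures
print(similarity.shape)
```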
Figures 1-10: the SiameseNet and WTSM-SiameseNet architectures; the MRF-ResNet architecture; multi-scale receptive field fusion; concurrent attention; CBAM attention; the texture feature aggregation and matching module; the sample dataset; training loss; and a wood-texture-similarity matching example.
21 pages, 54945 KiB  
Article
Efficient Registration of Airborne LiDAR and Terrestrial LiDAR Point Clouds in Forest Scenes Based on Single-Tree Position Consistency
by Xiaolong Cheng, Xinyu Liu, Yuemei Huang, Wei Zhou and Jie Nie
Forests 2024, 15(12), 2185; https://doi.org/10.3390/f15122185 - 12 Dec 2024
Viewed by 421
Abstract
Airborne LiDAR (ALS) and terrestrial LiDAR (TLS) data integration provides complementary perspectives for acquiring detailed 3D forest information. However, challenges in registration arise due to feature instability, low overlap, and differences in cross-platform point cloud density. To address these issues, this study proposes an automatic point cloud registration method based on the consistency of the single-tree position distribution in multi-species and complex forest scenes. In this method, single-tree positions are extracted as feature points using the Stepwise Multi-Form Fitting (SMF) technique. A novel feature point matching method is proposed by constructing a polar coordinate system, which achieves fast horizontal registration. Then, the Z-axis translation is determined through the integration of Cloth Simulation Filtering (CSF) and grid-based methods. Finally, the Iterative Closest Point (ICP) algorithm is employed to perform fine registration. The experimental results demonstrate that the method achieves high registration accuracy across four forest plots of varying complexity, with root-mean-square errors of 0.0423 m, 0.0348 m, 0.0313 m, and 0.0531 m. The registration accuracy is significantly improved compared to existing methods, and the time efficiency is enhanced by an average of 90%. This method offers robust and accurate registration performance in complex and diverse forest environments. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
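The horizontal registration step rests on expressing single-tree positions in polar coordinates about a reference tree, so that radii are invariant to rotation and translation. The NumPy sketch below illustrates that invariance on synthetic stem maps; the rigid transform, plot size, and matching-by-sorted-radii idea are illustrative assumptions rather than the paper's exact matching rule.

```python
import numpy as np

def polar_about(xy, ref_idx=0):
    """Radii and angles of all tree positions about one reference tree."""
    rel = np.delete(xy, ref_idx, axis=0) - xy[ref_idx]
    return np.hypot(rel[:, 0], rel[:, 1]), np.arctan2(rel[:, 1], rel[:, 0])

# Synthetic stem maps: a TLS plot and the same plot rigidly rotated + translated (ALS)
rng = np.random.default_rng(0)
tls = rng.uniform(0, 30, size=(25, 2))
a = np.deg2rad(40.0)
R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
als = tls @ R.T + np.array([12.0, -5.0])

r_tls, th_tls = polar_about(tls)
r_als, th_als = polar_about(als)

# Radii about corresponding reference trees are invariant to rotation/translation,
# so matching radius patterns identifies corresponding feature points; the angle
# offset between matched pairs then gives the horizontal rotation.
print(np.allclose(np.sort(r_tls), np.sort(r_als)))
print(np.rad2deg(np.median((th_als - th_tls) % (2 * np.pi))))  # approximately 40 degrees
```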
Figures 1-13: the original ALS and TLS point clouds of the four plots; the registration flowchart; trunk and crown fitting for single trees; single-tree splitting effects, including failure cases; construction and matching of the polar coordinate systems for TLS and ALS feature points; trunk and crown fitting visualizations for regular and irregular trees; feature point extraction results; the coarse registration process; registration results with zoomed-in details; registration accuracy for different numbers of single-tree feature points; and the impact of feature point extraction errors under different offsets.
21 pages, 17557 KiB  
Article
Lidar Simultaneous Localization and Mapping Algorithm for Dynamic Scenes
by Peng Ji, Qingsong Xu and Yifan Zhao
World Electr. Veh. J. 2024, 15(12), 567; https://doi.org/10.3390/wevj15120567 - 7 Dec 2024
Viewed by 737
Abstract
To address the issue of significant point cloud ghosting in the construction of high-precision point cloud maps by low-speed intelligent mobile vehicles due to the presence of numerous dynamic obstacles in the environment, which affects the accuracy of map construction, this paper proposes a LiDAR-based Simultaneous Localization and Mapping (SLAM) algorithm tailored for dynamic scenes. The algorithm employs a tightly coupled SLAM framework integrating LiDAR and inertial measurement unit (IMU). In the process of dynamic obstacle removal, the point cloud data is first gridded. To more comprehensively represent the point cloud information, the point cloud within the perception area is linearly discretized by height to obtain the distribution of the point cloud at different height layers, which is then encoded to construct a linear discretized height descriptor for dynamic region extraction. To preserve more static feature points without altering the original point cloud, the Random Sample Consensus (RANSAC) ground fitting algorithm is employed to fit and segment the ground point cloud within the dynamic regions, followed by the removal of dynamic obstacles. Finally, accurate point cloud poses are obtained through static feature matching. The proposed algorithm has been validated using open-source datasets and self-collected campus datasets. The results demonstrate that the algorithm improves dynamic point cloud removal accuracy by 12.3% compared to the ERASOR algorithm and enhances overall mapping and localization accuracy by 8.3% compared to the LIO-SAM algorithm, thereby providing a reliable environmental description for intelligent mobile vehicles. Full article
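One step the abstract describes is RANSAC-based ground fitting inside the dynamic regions so that static ground points are preserved. Below is a small self-contained NumPy sketch of RANSAC plane fitting on a synthetic scene; the thresholds, iteration count, and the toy data are assumptions, and the paper's full pipeline (height descriptors, encoding) is not reproduced.

```python
import numpy as np

def ransac_ground_plane(points, iters=200, dist_thresh=0.15, rng=np.random.default_rng(0)):
    """Fit z = ax + by + c by RANSAC and return the inlier (ground) mask."""
    best_mask, best_count = None, -1
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        A = np.c_[sample[:, :2], np.ones(3)]
        if abs(np.linalg.det(A)) < 1e-6:          # degenerate (collinear) sample
            continue
        a, b, c = np.linalg.solve(A, sample[:, 2])
        dist = np.abs(points[:, 0] * a + points[:, 1] * b + c - points[:, 2]) / np.sqrt(a * a + b * b + 1)
        mask = dist < dist_thresh
        if mask.sum() > best_count:
            best_mask, best_count = mask, mask.sum()
    return best_mask

# Synthetic scene: a flat ground patch plus an elevated "dynamic obstacle" cluster
rng = np.random.default_rng(1)
ground = np.c_[rng.uniform(-10, 10, (500, 2)), rng.normal(0.0, 0.03, 500)]
obstacle = np.c_[rng.uniform(2, 4, (100, 2)), rng.uniform(0.5, 1.8, 100)]
cloud = np.vstack([ground, obstacle])

ground_mask = ransac_ground_plane(cloud)
print(ground_mask.sum(), "ground points kept;", (~ground_mask).sum(), "candidate dynamic points")
```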
Figures 1-21: the algorithm framework; line and surface feature association; the dynamic obstacle removal flowchart; the LiDAR point cloud scan and the concentric circle radii formed on the horizontal plane; point cloud encoding and per-grid encoding diagrams; correspondence between keyframes and submaps; the dynamic obstacle recognition point cloud; the experimental platform, real scene map, and initial point cloud map; comparison of dynamic obstacle removal against ERASOR; trajectory comparisons for the gate02 and street02 sequences on the x-z plane; trajectory error charts for ALOAM, LeGO-LOAM, LIO-SAM, and the proposed algorithm on both sequences; the two campus self-collected scenes; trajectory comparisons for both scenes; and ATE distributions for the two campus experiments.
16 pages, 21810 KiB  
Article
Enhancing Direct Georeferencing Using Real-Time Kinematic UAVs and Structure from Motion-Based Photogrammetry for Large-Scale Infrastructure
by Soohee Han and Dongyeob Han
Drones 2024, 8(12), 736; https://doi.org/10.3390/drones8120736 - 5 Dec 2024
Viewed by 731
Abstract
The growing demand for high-accuracy mapping and 3D modeling using unmanned aerial vehicles (UAVs) has accelerated advancements in flight dynamics, positioning accuracy, and imaging technology. Structure from motion (SfM), a computer vision-based approach, is increasingly replacing traditional photogrammetry through facilitating the automation of processes such as aerial triangulation (AT), terrain modeling, and orthomosaic generation. This study examines methods to enhance the accuracy of SfM-based AT through real-time kinematic (RTK) UAV imagery, focusing on large-scale infrastructure applications, including a dam and its entire basin. The target area, primarily consisting of homogeneous water surfaces, poses considerable challenges for feature point extraction and image matching, which are crucial for effective SfM. To overcome these challenges and improve the AT accuracy, a constraint equation was applied, incorporating weighted 3D coordinates derived from RTK UAV data. Furthermore, oblique images were combined with nadir images to stabilize AT, and confidence-based filtering was applied to point clouds to enhance geometric quality. The results indicate that assigning appropriate weights to 3D coordinates and incorporating oblique imagery significantly improve the AT accuracy. This approach presents promising advancements for RTK UAV-based AT in SfM-challenging, large-scale environments, thus supporting more efficient and precise mapping applications. Full article
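The abstract describes adding a constraint with weighted 3D coordinates from the RTK UAV to the aerial triangulation. As a toy illustration only, the sketch below stacks relative ("image-derived") constraints with a weighted prior pulling estimated camera positions toward RTK-measured ones and solves the least-squares problem with SciPy; the synthetic data, weights, and residual structure are assumptions and do not represent the authors' AT formulation.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy problem: estimate camera positions X (N x 3) from noisy relative offsets
# between consecutive cameras, plus a weighted prior toward RTK coordinates.
rng = np.random.default_rng(0)
true_pos = np.cumsum(rng.uniform(-1, 1, (10, 3)) + [5, 0, 0], axis=0)
rel_meas = np.diff(true_pos, axis=0) + rng.normal(0, 0.05, (9, 3))   # relative constraints
rtk_meas = true_pos + rng.normal(0, 0.03, true_pos.shape)            # RTK positions

def residuals(x, w_rtk):
    X = x.reshape(-1, 3)
    r_rel = (np.diff(X, axis=0) - rel_meas).ravel()
    r_rtk = (np.sqrt(w_rtk) * (X - rtk_meas)).ravel()   # weight acts as an inverse variance
    return np.concatenate([r_rel, r_rtk])

for w in (1e-4, 1.0 / 0.03**2):          # weak vs. strong RTK constraint
    sol = least_squares(residuals, rtk_meas.ravel(), args=(w,))
    err = np.linalg.norm(sol.x.reshape(-1, 3) - true_pos, axis=1).mean()
    print(f"RTK weight {w:g}: mean position error {err:.3f} m")
```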
Figures 1-8: four scenarios for nadir-oblique combined photography; common SfM procedures; overviews of Site 1 and Site 2 (basemaps generated using the V-World API); locations of failed images at Site 1 with a sample image; checkpoint locations at both sites; and point clouds with error points at Site 1 before and after confidence-based filtering, from horizontal and perspective views.
24 pages, 13141 KiB  
Article
Robust and Efficient Registration of Infrared and Visible Images for Vehicular Imaging Systems
by Kai Che, Jian Lv, Jiayuan Gong, Jia Wei, Yun Zhou and Longcheng Que
Remote Sens. 2024, 16(23), 4526; https://doi.org/10.3390/rs16234526 - 3 Dec 2024
Viewed by 472
Abstract
The automatic registration of infrared and visible images in vehicular imaging systems remains challenging in vision-assisted driving systems because of differences in imaging mechanisms. Existing registration methods often fail to accurately register infrared and visible images in vehicular imaging systems due to numerous spurious points during feature extraction, unstable feature descriptions, and low feature matching efficiency. To address these issues, a robust and efficient registration of infrared and visible images for vehicular imaging systems is proposed. In the feature extraction stage, we propose a structural similarity point extractor (SSPE) that extracts feature points using the structural similarity between weighted phase congruency (PC) maps and gradient magnitude (GM) maps. This approach effectively suppresses invalid feature points while ensuring the extraction of stable and reliable ones. In the feature description stage, we design a rotation-invariant feature descriptor (RIFD) that comprehensively describes the attributes of feature points, thereby enhancing their discriminative power. In the feature matching stage, we propose an effective coarse-to-fine matching strategy (EC2F) that improves the matching efficiency through nearest neighbor matching and threshold-based fast sample consensus (FSC), while improving registration accuracy through coordinate-based iterative optimization. Registration experiments on public datasets and a self-established dataset demonstrate the superior performance of our proposed method, and also confirm its effectiveness in real vehicular environments. Full article
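The paper's coarse-to-fine strategy combines nearest-neighbor matching with threshold-based fast sample consensus and iterative refinement. As a generic stand-in (not the SSPE/RIFD/FSC pipeline), the OpenCV sketch below performs nearest-neighbor matching with a ratio test for the coarse stage and RANSAC-based outlier rejection for the fine stage; the ORB detector, thresholds, and image file names are assumptions.

```python
import cv2
import numpy as np

# Generic coarse-to-fine matching sketch: detect keypoints, match descriptors with a
# nearest-neighbor search, keep ratio-test matches (coarse), then fit a transform
# with RANSAC to reject outliers (fine).
img_ir = cv2.imread("infrared.png", cv2.IMREAD_GRAYSCALE)    # hypothetical file names
img_vis = cv2.imread("visible.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img_ir, None)
kp2, des2 = orb.detectAndCompute(img_vis, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
coarse = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.8 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in coarse]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in coarse]).reshape(-1, 1, 2)
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)   # fine stage

print(f"coarse matches: {len(coarse)}, inliers after RANSAC: {int(inlier_mask.sum())}")
```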
Figures 1-16: the vehicular imaging system with infrared and visible cameras; the framework of the method; weighted PC and GM maps of infrared and visible images; dominant orientation angle estimation for feature points; the rotation-invariant feature descriptor; sample infrared and visible image pairs; qualitative comparisons on the sample data; registration fusion results on four image pairs; visual comparison on the M3FD dataset; the effect of SSPE on matching; rotational performance metrics on seven datasets; matching with rotational invariance; quantitative comparison of descriptor substitution; the effect of coarse-to-fine matching; the registration process in a vehicular imaging system; and registration results for various scenes.
18 pages, 8489 KiB  
Article
Tightly Coupled SLAM Algorithm Based on Similarity Detection Using LiDAR-IMU Sensor Fusion for Autonomous Navigation
by Jiahui Zheng, Yi Wang and Yadong Men
World Electr. Veh. J. 2024, 15(12), 558; https://doi.org/10.3390/wevj15120558 - 2 Dec 2024
Viewed by 556
Abstract
In recent years, the rise of unmanned technology has made Simultaneous Localization and Mapping (SLAM) algorithms a focal point of research in the field of robotics. SLAM algorithms are primarily categorized into visual SLAM and laser SLAM, based on the type of external sensors employed. Laser SLAM algorithms have become essential in robotics and autonomous driving due to their insensitivity to lighting conditions, precise distance measurements, and ease of generating navigation maps. Throughout the development of SLAM technology, numerous effective algorithms have been introduced. However, existing algorithms still encounter challenges, such as localization errors and suboptimal utilization of sensor data. To address these issues, this paper proposes a tightly coupled SLAM algorithm based on similarity detection. The algorithm integrates Inertial Measurement Unit (IMU) and LiDAR odometry modules, employs a tightly coupled processing approach for sensor data, and utilizes curvature feature optimization extraction methods to enhance the accuracy and robustness of inter-frame matching. Additionally, the algorithm incorporates a local keyframe sliding window method and introduces a similarity detection mechanism, which reduces the real-time computational load and improves efficiency. Experimental results demonstrate that the algorithm achieves superior performance, with reduced positioning errors and enhanced global consistency, in tests conducted on the KITTI dataset. The accuracy of the real trajectory data compared to the ground truth is evaluated using metrics such as ATE (absolute trajectory error) and RMSE (root mean square error). Full article
(This article belongs to the Special Issue Motion Planning and Control of Autonomous Vehicles)
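The abstract's "curvature feature optimization extraction" builds on the common LOAM-style criterion of scoring each scan-line point by how far its neighbors deviate from it. Below is a minimal NumPy sketch of that curvature score on a synthetic scan line; the window size, normalization, and selection threshold are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def scanline_curvature(points, k=5):
    """LOAM-style curvature: norm of the summed differences to the k neighbors
    on either side of each point along the scan line."""
    n = len(points)
    curv = np.full(n, np.nan)
    for i in range(k, n - k):
        diff = points[i - k:i + k + 1] - points[i]
        curv[i] = np.linalg.norm(diff.sum(axis=0))
    return curv

# Synthetic scan line: a smooth arc with one sharp discontinuity
theta = np.linspace(0, np.pi / 2, 200)
arc = np.c_[10 * np.cos(theta), 10 * np.sin(theta), np.zeros_like(theta)]
corner = arc.copy()
corner[100:, 0] += 2.0                                  # introduce a jump at index 100

curv = scanline_curvature(corner)
edge_idx = np.argsort(curv[~np.isnan(curv)])[-5:] + 5   # highest curvature -> edge features
print("edge feature indices:", np.sort(edge_idx))       # clustered around the discontinuity
```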
Figures 1-13: the algorithm flowchart based on feature optimization; the vehicle (O_B X_B Y_B Z_B) and LiDAR (O_L X_L Y_L Z_L) coordinate systems; the LiDAR point cloud extraction process; the Hausdorff distance; cumulative error in loop detection; the loop detection and determination flow chart; trajectory and per-axis deviation comparisons against other algorithms on KITTI sequences 07 and 09; the vehicle acquisition platform; a satellite map of the outdoor scene; the mapping result of the proposed algorithm; detailed comparisons with LIO-SAM and Point-LIO; and trajectory error comparisons between the three algorithms.
17 pages, 4225 KiB  
Article
Integrating Metabolomics Domain Knowledge with Explainable Machine Learning in Atherosclerotic Cardiovascular Disease Classification
by Everton Santana, Eliana Ibrahimi, Evangelos Ntalianis, Nicholas Cauwenberghs and Tatiana Kuznetsova
Int. J. Mol. Sci. 2024, 25(23), 12905; https://doi.org/10.3390/ijms252312905 - 30 Nov 2024
Viewed by 420
Abstract
Metabolomic data often present challenges due to high dimensionality, collinearity, and variability in metabolite concentrations. Machine learning (ML) application in metabolomic analyses is enabling the extraction of meaningful information from complex data. Bringing together domain-specific knowledge from metabolomics with explainable ML methods can refine the predictive performance and interpretability of models used in atherosclerosis research. In this work, we aimed to identify the most impactful metabolites associated with the presence of atherosclerotic cardiovascular disease (ASCVD) in cross-sectional case–control studies using explainable ML methods integrated with metabolomics domain knowledge. For this, a subset from the FLEMENGHO cohort with metabolomic data available was used as the training cohort, including 63 patients with a history of ASCVD and 52 non-smoking controls matched by age, sex, and body mass index from the same population. First, Partial Least Squares Discriminant Analysis (PLS-DA) was applied for dimensionality reduction. The selected metabolites’ correlations were analyzed by considering their chemical categorization. Then, eXtreme Gradient Boosting (XGBoost) was used to identify metabolites that characterize ASCVD. Next, the selected metabolites were evaluated in an external cohort to determine their effectiveness in distinguishing between cases and controls. A total of 56 metabolites were selected for ASCVD discrimination using PLS-DA. The primary identified metabolites’ superclasses included lipids, organic acids, and organic oxygen compounds. Upon integrating these metabolites with the XGBoost model, the classification yielded a test area under the curve (AUC) of 0.75. SHAP analyses ranked cholesterol, 3-methylhistidine, and glucuronic acid among the most impactful features and showed the diversity of metabolites considered for building the ASCVD discriminator. Also using XGBoost, the selected metabolites achieved an AUC of 0.93 in an independent external validation cohort. In conclusion, the combination of different metabolites has the potential to build classifiers for ASCVD. Integrating metabolite categorization within the SHAP analysis further enhanced the interpretability of the model, offering insights into metabolite-specific contributions to ASCVD risk. Full article
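The pipeline described is PLS-DA for feature selection, XGBoost for classification, and SHAP for interpretation. The sketch below wires those stages together on synthetic data, using scikit-learn's PLSRegression as a PLS-DA surrogate and a simple loading-based selection rule; the data, the 56-feature cutoff, and the selection criterion are assumptions, not the authors' exact procedure.

```python
import numpy as np
import shap
import xgboost as xgb
from sklearn.cross_decomposition import PLSRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Placeholder metabolomics matrix: 115 subjects x 200 metabolites, binary ASCVD label
X, y = make_classification(n_samples=115, n_features=200, n_informative=25, random_state=0)

# 1) PLS-DA (PLS regression on the binary label) for dimensionality reduction;
#    keep the metabolites with the largest absolute weights on the first components.
pls = PLSRegression(n_components=2).fit(X, y)
importance = np.abs(pls.x_weights_).sum(axis=1)
selected = np.argsort(importance)[-56:]                 # e.g., 56 selected features

# 2) Gradient boosting classifier on the selected metabolites
X_tr, X_te, y_tr, y_te = train_test_split(X[:, selected], y, test_size=0.3, random_state=0)
model = xgb.XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
model.fit(X_tr, y_tr)

# 3) SHAP values for interpretability (which metabolites push predictions toward ASCVD)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)
print("mean |SHAP| of the most impactful feature:", np.abs(shap_values).mean(axis=0).max())
```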
Figures 1-4: the selected metabolites' superclass distribution and Spearman correlation network; superclass-informed SHAP analysis of the XGBoost model in the FLEMENGHO cohort with the 56 selected features; SHAP analysis per metabolite superclass in the training set; and the analysis pipeline from PLS-DA feature selection through XGBoost, SHAP, and external validation against ischemic heart disease cases and controls.
24 pages, 7259 KiB  
Article
A Pseudo-Waveform-Based Method for Grading ICESat-2 ATL08 Terrain Estimates in Forested Areas
by Rong Zhao, Qing Hu, Zhiwei Liu, Yi Li and Kun Zhang
Forests 2024, 15(12), 2113; https://doi.org/10.3390/f15122113 - 28 Nov 2024
Viewed by 586
Abstract
The ICESat-2 Land and Vegetation Height (ATL08) product is a new control point dataset for large-scale topographic mapping and geodetic surveying. However, its elevation accuracy is typically affected by multiple factors. The study aims to propose a new approach to classify ATL08 terrain estimates into different accuracy levels and extract reliable ground control points (GCPs) from ICESat-2 ATL08. Specifically, the methodology is divided into three stages. First, the ATL08 terrain estimates are matched with the raw ATL03 photon cloud data, and the ATL08 terrain estimates are used to fit a continuous terrain curve. Then, using the fitted continuous terrain curve and raw ATL03 photon cloud data, a pseudo-waveform is generated for grading the ATL08 terrain estimates. Finally, all the ATL08 terrain estimates are graded based on the peak characteristics of the generated pseudo-waveform. To validate the feasibility of the proposed method, four study areas from the National Ecological Observatory Network (NEON), characterized by various terrain features and forest types were selected. High-accuracy airborne lidar data were used to evaluate the accuracy of graded ICESat-2 terrain estimates. The results demonstrate that the method effectively classified all ATL08 terrain estimates into different accuracy levels and successfully extracted high-accuracy GCPs. The root mean square errors (RMSEs) of the first accuracy level in the four selected study areas were 0.99 m, 0.51 m, 1.88 m, and 0.65 m, representing accuracy improvement of 51.7%, 58.2%, 83.1%, and 68.8%, respectively, compared to the original ATL08 terrain estimates before classifying. Additionally, a comparison with the conventional threshold-based GCP extraction method demonstrated the superior performance of our proposed approach. This study introduces a new approach to extract high-quality elevation control points from ICESat-2 ATL08 data, particularly in forested areas. Full article
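The core idea is to build a pseudo-waveform from ATL03 photon heights around the fitted terrain curve and grade each ATL08 estimate by the waveform's peak characteristics. The SciPy sketch below histograms synthetic relative photon heights and grades by the dominant and secondary peaks; the bin width, peak thresholds, and L1/L2/L3 rule are illustrative assumptions, not the paper's grading criteria.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic photon heights relative to the fitted terrain curve: a ground return
# near 0 m, a canopy return several metres above, and background noise.
rng = np.random.default_rng(0)
rel_height = np.concatenate([
    rng.normal(0.0, 0.3, 400),      # ground photons
    rng.normal(12.0, 2.0, 250),     # canopy photons
    rng.uniform(-20, 40, 80),       # noise photons
])

# Pseudo-waveform: histogram of relative photon heights in fixed bins
waveform, edges = np.histogram(rel_height, bins=np.arange(-25, 45, 0.5))
centers = 0.5 * (edges[:-1] + edges[1:])

# Grade by peak characteristics: one dominant peak near 0 m suggests a reliable
# terrain estimate; a strong secondary peak suggests canopy or noise contamination.
peaks, props = find_peaks(waveform, height=5, distance=4)
order = np.argsort(props["peak_heights"])[::-1]
largest = peaks[order[0]]
second = peaks[order[1]] if len(peaks) > 1 else None

if second is None or props["peak_heights"][order[1]] < 0.3 * props["peak_heights"][order[0]]:
    grade = "L1 (high confidence)" if abs(centers[largest]) < 1.0 else "L2"
else:
    grade = "L3 (ambiguous: competing peak at %.1f m)" % centers[second]
print(grade)
```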
Figures 1-11: geolocation of the four NEON study areas (MLBS, DELA, GRSM, TREE) with ICESat-2 data strips; the flowchart of the grading method; derivation of the pseudo-waveform from ATL08 and ATL03; the photon distribution within a statistical buffer zone and the three largest pseudo-waveform peaks; distributions of graded ATL08 terrain estimates across terrain slopes and VCFs; raw versus graded (L1-L3) estimates and their scatterplots for the four areas; RMSE and data retention rate compared with the threshold-based method; and elevation error distributions by terrain slope and VCF class at the three accuracy levels.
18 pages, 16417 KiB  
Article
Joint Object Detection and Multi-Object Tracking Based on Hypergraph Matching
by Zhoujuan Cui, Yuqi Dai, Yiping Duan and Xiaoming Tao
Appl. Sci. 2024, 14(23), 11098; https://doi.org/10.3390/app142311098 - 28 Nov 2024
Viewed by 409
Abstract
Addressing the challenges in online multi-object tracking algorithms under complex scenarios, where the independence among feature extraction, object detection, and data association modules leads to both error accumulation and the difficulty of maintaining visual consistency for occluded objects, we have proposed an end-to-end multi-object tracking method based on hypergraph matching (JDTHM). Initially, a feature extraction and object detection module is introduced to achieve preliminary localization and description of the objects. Subsequently, a deep feature aggregation module is designed to extract temporal information from historical tracklets, amalgamating features from object detection and feature extraction to enhance the consistency between the current frame features and the tracklet features, thus preventing identity swaps and tracklet breaks caused by object detection loss or distortion. Finally, a data association module based on hypergraph matching is constructed, integrating with object detection and feature extraction into a unified network, transforming the data association problem into a hypergraph matching problem between the tracklet hypergraph and the detection hypergraph, thereby achieving end-to-end model optimization. The experimental results demonstrate that this method has yielded favorable qualitative and quantitative analysis results on three multi-object tracking datasets, thereby validating its effectiveness in enhancing the robustness and accuracy of multi-object tracking tasks. Full article
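The paper casts data association as hypergraph matching. As a much simpler stand-in for that association step, the sketch below solves the classic pairwise tracklet-to-detection assignment with the Hungarian algorithm on an IoU cost; the boxes, the gating threshold, and the cost design are assumptions, and higher-order (hypergraph) affinities are not modeled here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

tracklets = np.array([[10, 10, 50, 80], [200, 40, 260, 120]], dtype=float)   # last known boxes
detections = np.array([[12, 14, 52, 84], [198, 38, 258, 118], [400, 60, 440, 130]], dtype=float)

cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracklets])
rows, cols = linear_sum_assignment(cost)            # minimum-cost one-to-one assignment

for r, c in zip(rows, cols):
    if cost[r, c] < 0.7:                            # gate: reject implausible pairs
        print(f"tracklet {r} -> detection {c} (IoU={1 - cost[r, c]:.2f})")
```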
Show Figures

Figure 1: (a) Input images for the MOT task. (b) Schematic representation of the object tracklets. Existing methods often fail when multiple objects with similar appearance or motion patterns appear in close proximity, or when occlusion occurs.
Figure 2: Overview of the graph/hypergraph matching pipeline. (a) Graph matching. (b) Association graph. (c) Association hypergraph. The node-to-node matching problem in (a) can be formulated as a node classification task on the association graph, whose edge weights are induced by the affinity matrix. Similarly, the vertex-to-vertex matching problem can be formulated as a vertex classification task on the association hypergraph, where the edge weights are induced by the affinity tensor.
Figure 3: Framework overview of the proposed method (JDTHM). JDTHM is composed of a feature extraction and object detection module, a feature aggregation module, and a data association module. Differently colored circles represent candidate bboxes and history tracklets.
Figure 4: Hypergraph matching process diagram. The computation of vertex-to-vertex matching relationships between two hypergraphs is translated into a vertex classification task on the association hypergraph, where edge weights can be induced by the affinity tensor.
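For readers unfamiliar with the notation behind Figures 2 and 4, the generic third-order hypergraph matching objective from the matching literature is reproduced below; the paper's learned, end-to-end formulation may differ in detail, so this is background rather than the authors' exact model. Here H denotes the affinity tensor over vertex triples and X the assignment matrix between tracklet and detection vertices.

```latex
\max_{\mathbf{X} \in \{0,1\}^{n_1 \times n_2}}
  \sum_{i_1, i_2, i_3} \sum_{j_1, j_2, j_3}
  \mathcal{H}_{i_1 i_2 i_3,\, j_1 j_2 j_3} \,
  X_{i_1 j_1} X_{i_2 j_2} X_{i_3 j_3}
\quad \text{s.t.} \quad \mathbf{X}\mathbf{1} \le \mathbf{1}, \;
  \mathbf{X}^{\top}\mathbf{1} \le \mathbf{1}
```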
Figure 5: Visualization of our results on four sequences of the MOT17. Each row shows the results of sampled frames in chronological order of a video sequence. Bboxes and identities are marked in the images. Bboxes with different colors represent different identities. Best viewed in color.
Figure 6: Visualization of our results on three sequences of the MOT20. Bboxes and identities are marked in the images. Bboxes with different colors represent different identities. Best viewed in color.
18 pages, 12610 KiB  
Article
Automatic Registration of Panoramic Images and Point Clouds in Urban Large Scenes Based on Line Features
by Panke Zhang, Hao Ma, Liuzhao Wang, Ruofei Zhong, Mengbing Xu and Siyun Chen
Remote Sens. 2024, 16(23), 4450; https://doi.org/10.3390/rs16234450 - 27 Nov 2024
Viewed by 457
Abstract
As the combination of panoramic images and laser point clouds becomes increasingly widespread, the accurate determination of the external parameters between the two sensors has become essential. However, due to relative position changes of the sensors and time synchronization errors, automatically and accurately matching a panoramic image to a point cloud is very challenging. To solve this problem, this paper proposes an automatic and accurate registration method for panoramic images and point clouds of large urban scenes based on line features. Firstly, a multi-modal point cloud line feature extraction algorithm is used to extract the edges of the point cloud: the edges of road markings are extracted from the point cloud intensity orthoimage (an orthogonal image based on the point cloud’s intensity values), and geometric feature edges are extracted with a 3D voxel method. Using the established virtual projection correspondence, the panoramic image is projected onto virtual planes for edge extraction. Secondly, an accurate matching relationship is constructed using direction vector constraints, and the edge features from both sensors are refined and aligned to enable accurate calculation of the registration parameters. The experimental results show that the proposed method achieves excellent registration results in challenging urban scenes: the average registration error is better than 3 pixels, and the root mean square error (RMSE) is less than 1.4 pixels. Compared with mainstream methods, it offers clear advantages and can promote further research on and application of combined panoramic images and laser point clouds. Full article
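As background for the projection step described above, here is a minimal sketch (an illustration under assumed conventions, not the paper's implementation) of the core geometry any panorama/point-cloud registration relies on: projecting LiDAR points into an equirectangular panorama under candidate extrinsics (R, t), so that projected point-cloud edges can be compared with image edges. The axis convention, function name, and image size are assumptions.

```python
# Equirectangular projection of 3D points under assumed extrinsics (sketch only).
import numpy as np

def project_to_panorama(points_xyz, R, t, width, height):
    """Project Nx3 points into equirectangular pixel coordinates.

    points_xyz: (N, 3) points in the LiDAR/world frame.
    R, t:       3x3 rotation and 3-vector translation taking world -> camera frame.
    width, height: panorama size in pixels (width spans 360 deg, height spans 180 deg).
    Assumed camera axes: x right, y down, z forward.
    """
    p_cam = points_xyz @ R.T + t                         # world -> camera frame
    x, y, z = p_cam[:, 0], p_cam[:, 1], p_cam[:, 2]
    lon = np.arctan2(x, z)                               # longitude in [-pi, pi]
    lat = np.arcsin(y / np.linalg.norm(p_cam, axis=1))   # latitude in [-pi/2, pi/2]
    u = (lon / (2.0 * np.pi) + 0.5) * width              # column index
    v = (lat / np.pi + 0.5) * height                     # row index
    return np.stack([u, v], axis=1)

# Toy usage: with an identity pose, a point straight ahead lands at the image centre.
pts = np.array([[0.0, 0.0, 5.0]])
print(project_to_panorama(pts, np.eye(3), np.zeros(3), 4096, 2048))  # ~[2048, 1024]
```

In a registration loop, R and t would be refined until the reprojected point-cloud edge points best overlap the edges extracted from the panorama, which is the role the direction-vector constraints play in the method above.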
Show Figures

Figure 1: The framework of the registration method.
Figure 2: Virtual projection segmentation of the panoramic image. (A) Front view after projection; (B) right view after projection; (C) back view after projection; (D) left view after projection.
Figure 3: Transformation between the panoramic image and point cloud coordinate systems.
Figure 4: Overview of the experimental areas. (a) Beijing; (b) Guangzhou; (c) Hong Kong.
Figure 5: Point cloud road marking edge detection: (a) original point cloud; (b) intensity orthoimage; (c) semantic segmentation; (d) road marking edge points.
Figure 6: Point cloud geometric edge: (a) point cloud voxel; (b) geometric feature edge points.
Figure 7: Panoramic segmentation and edge line extraction: (a) directly extracted from the panoramic image; (b) extracted after virtual projection segmentation.
Figure 8: Visualization of the registration process: (a) initial registration; (b) final registration.
Figure 9: Panorama and point cloud registration effect diagram: (a) before algorithm processing; (b) after algorithm processing.
Figure 10: The visualization effects of different methods: (a–d) results of methods A–D; (e) the overall effect diagram of method D. The numbers (1–3) represent the results on datasets I–III.
19 pages, 216336 KiB  
Article
Passive Perception and Path Tracking of Tourists in Mountain Scenic Spots Through Face to Body Two Stepwise Method
by Fan Yang, Changming Zhu, Kuntao Shi, Junli Li, Qian Shen and Xin Zhang
ISPRS Int. J. Geo-Inf. 2024, 13(12), 423; https://doi.org/10.3390/ijgi13120423 - 25 Nov 2024
Viewed by 494
Abstract
Tourists’ near-field passive perception and identification in mountain areas faces challenges related to long distances, small targets, varied poses, facial occlusion, etc. To address this issue, this paper proposes an innovative technical framework based on a face-to-body (F2B) two-step iterative method, aimed at enhancing the passive perception and tracking of tourists in complex mountain environments by integrating and coordinating body features with facial features. The F2B framework comprises three main components: target feature acquisition, multi-feature coupled re-identification, and target positioning and tracking. Initially, the faces and bodies of tourists are extracted from real-time video streams using the RetinaFace and YOLOX models, respectively. The ArcFace model is then employed to extract the facial features of the target tourists, linking them with the faces detected by RetinaFace. Subsequently, a multi-feature database is constructed using the Hungarian algorithm to facilitate the automatic matching of the face and body of the same tourist. Finally, the Fast-ReID model and a spatial position algorithm are utilized for the re-identification of tourist targets and the tracking of their dynamic paths. Deployment and testing in the Yimeng Mountain Scenic Area on public and actual scene datasets have demonstrated that the accuracy index AP of the F2B model reaches 88.03%, with a recall of 90.28%, achieving an overall identification accuracy of approximately 90% and a false alarm rate of less than 5%. This result significantly improves on the accuracy of SOTA facial recognition models in the complex environments of mountain scenic spots and effectively addresses the low identification accuracy of non-cooperative targets in these areas through a ground video sensing network. Furthermore, the framework provides technical support for the spatiotemporal information needed for near-field passive perception and path tracking of tourists in mountain scenic spots and shows broad application prospects. Full article
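To illustrate the face-to-body pairing step described above, here is a minimal sketch (our illustration under assumed box formats, not the authors' code) that uses the Hungarian algorithm to assign each detected face to the body box that best contains it; the containment measure, threshold, and function names are assumptions.

```python
# Face-to-body assignment sketch: pair face boxes with body boxes via the
# Hungarian algorithm on a containment-based cost (boxes are x1, y1, x2, y2).
import numpy as np
from scipy.optimize import linear_sum_assignment

def containment(face, body):
    """Fraction of the face box area that lies inside the body box."""
    ix1, iy1 = max(face[0], body[0]), max(face[1], body[1])
    ix2, iy2 = min(face[2], body[2]), min(face[3], body[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    face_area = (face[2] - face[0]) * (face[3] - face[1])
    return inter / face_area if face_area > 0 else 0.0

def match_faces_to_bodies(face_boxes, body_boxes, min_overlap=0.8):
    # Cost is 1 - containment, so minimizing cost maximizes containment.
    cost = np.array([[1.0 - containment(f, b) for b in body_boxes] for f in face_boxes])
    rows, cols = linear_sum_assignment(cost)
    return [(f, b) for f, b in zip(rows, cols) if 1.0 - cost[f, b] >= min_overlap]

# Toy usage: one face box fully inside the first of two body boxes.
faces = [(110, 60, 150, 110)]
bodies = [(100, 50, 220, 400), (300, 40, 420, 380)]
print(match_faces_to_bodies(faces, bodies))  # -> [(0, 0)]
```

In a pipeline like F2B, this kind of assignment would couple each facial feature with the body detection that is later used for re-identification once the face is too small or occluded to match directly.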
Show Figures

Graphical abstract

Figure 1: Research area and test site.
Figure 2: The framework of the F2B technology.
Figure 3: The network framework for real-time feature set collection of target tourists.
Figure 4: The architecture of tourist re-identification based on multi-feature coupling.
Figure 5: Tourist identification based on the F2B model in different scenarios. (A,D) represent the identification results of targets with different poses in multi-tourist scenarios; (B) shows the identification results when the target is looking down; and (C) represents the identification results when the target's face is occluded.
Figure 6: Comparison of tourist identification results among the Facenet, Cosface, ArcFace, Adaface, and F2B algorithms in the long-range small-target scene. The subfigures provide enlarged views of the recognition results.
Figure 7: Comparison of tourist identification results among the Facenet, Cosface, ArcFace, Adaface, and F2B algorithms in the varied-pose scene. The subfigures provide enlarged views of the recognition results.
Figure 8: Comparison of tourist identification results among the Facenet, Cosface, ArcFace, Adaface, and F2B algorithms in the facial-occlusion scene. The subfigures provide enlarged views of the recognition results.
Figure 9: Results of target tourist positioning and tracking. (A) shows how the video positioning algorithm maps the coordinates of the target tourist in the monitoring footage to the real world; in (B), the red trajectory is the passive positioning trajectory obtained by connecting the target tourist's coordinate points from the video footage, and the blue trajectory is the active satellite positioning trajectory.
Figure 10: Comparative analysis of multi-pose non-cooperative target recognition results (Pose A: sideways; Pose B: turned around; Pose C: head down).
Figure 11: Error source and interference factor analysis: (a) the result of face recognition only; (b) the result of body recognition; (c) the precise extraction and separation of external feature contamination of the target tourist; (d) the result of F2B recognition after supplementing features.