Article

Robot Localization Method Based on Multi-Sensor Fusion in Low-Light Environment

1 School of Surveying and Land Information Engineering, Henan Polytechnic University, Jiaozuo 454003, China
2 Department of Civil and Environmental Engineering, Universitat Politècnica de Catalunya-BarcelonaTech, 08034 Barcelona, Spain
* Author to whom correspondence should be addressed.
Electronics 2024, 13(22), 4346; https://doi.org/10.3390/electronics13224346
Submission received: 10 September 2024 / Revised: 25 October 2024 / Accepted: 4 November 2024 / Published: 6 November 2024

Abstract

When robots perform localization in indoor low-light environments, factors such as weak and uneven lighting can degrade image quality. This degradation results in a reduced number of feature extractions by the visual odometry front end and may even cause tracking loss, thereby impacting the algorithm’s positioning accuracy. To enhance the localization accuracy of mobile robots in indoor low-light environments, this paper proposes a visual inertial odometry method (L-MSCKF) based on the multi-state constraint Kalman filter. Addressing the challenges of low-light conditions, we integrated Inertial Measurement Unit (IMU) data with stereo vision odometry. The algorithm includes an image enhancement module and a gyroscope zero-bias correction mechanism to facilitate feature matching in stereo vision odometry. We conducted tests on the EuRoC dataset and compared our method with other similar algorithms, thereby validating the effectiveness and accuracy of L-MSCKF.

1. Introduction

Mobile robotics has drawn increasing interest in recent years [1]. The autonomous positioning methods of mobile robots in unknown environments currently fall into two groups. The first obtains the real-time position of the robot by measuring the spatial distances to multiple satellites with the aid of global navigation satellite technology [2], such as the Global Positioning System (GPS) and the BeiDou Navigation Satellite System (BDS). With this option, when the robot works indoors, the weakening of the GPS signal or the limitations of the hardware platform prevent it from obtaining accurate location information [3]. At the same time, useful information about the surrounding environment cannot be acquired, which reduces the reliability of navigation and prevents functions such as path planning, obstacle avoidance, and control from being realized. The second uses onboard sensors to achieve active positioning, such as Simultaneous Localization and Mapping (SLAM), which employs lightweight, low-cost sensors to gather data about the surroundings while the robot moves, thereby achieving autonomous positioning [4]. As robot applications continue to expand, there is growing demand for operation in indoor environments where satellite signals are blocked, and SLAM-based active positioning technology has therefore come into broad use.
SLAM is an essential technology for self-navigating mobile robots [5]. Depending on the sensors employed, the techniques can be classified into three main groups: vision-based, lidar-based, and multi-sensor fusion approaches [6]. The vision-based approach is also called visual odometry; its main idea is to obtain position and attitude by analyzing the visual data captured by the camera [7]. Compared with lidars, cameras have advantages in price and weight [8]. When the robot moves indoors, however, the characteristics of the setting, such as a narrow field of view, must be taken into account, and insufficient ambient light degrades the robot's autonomous positioning. The usual practice is to fuse data from additional sensors to enhance the stability of the algorithm [9]. For instance, by measuring the platform's acceleration and angular velocity, the Inertial Measurement Unit (IMU) compensates for the loss of visual information under low light levels.
The visual inertial odometry (VIO) technique, which integrates vision and inertial navigation, provides an effective route for SLAM to achieve miniaturization and low cost, owing to its small size, low price, strong robustness, and applicability to various motion scenarios [10]. As a classical filtering method in VIO, the Multi-State Constraint Kalman Filter (MSCKF) [11] significantly reduces the number of state parameters in the filter, effectively addresses the growth of computational complexity over time, and enhances both robustness and convergence. Compared with optimization-based VIO algorithms, the MSCKF matches their accuracy while running faster, making it well suited for deployment on embedded systems with constrained computing capabilities. However, the presence of IMU bias [12] degrades the system's pose estimation accuracy. In addition, the computing resources of ordinary mobile robot platforms are limited. How to use these limited resources to alleviate the impact of insufficient ambient light while ensuring the real-time performance and accuracy of robot positioning algorithms is therefore of great research significance.
To tackle the challenges of VIO measurement under low-light scenarios, this paper introduces an algorithm, L-MSCKF, specifically designed to operate effectively in dimly lit environments. Based on the MSCKF framework, two key enhancements are implemented to adapt to indoor settings with inadequate lighting. The core innovations are as follows:
(1)
Application of image enhancement technology. The integration of homomorphic filtering and Contrast Limited Adaptive Histogram Equalization (CLAHE) enhances the brightness and clarity of images captured by the camera in low-light conditions, greatly improving the performance of the vision sensor in such settings.
(2)
Introduction of the complementary Kalman filter. To address the impact of the IMU zero-bias on the accuracy of attitude estimation, we integrate a complementary Kalman filter into our approach. This filter leverages accelerometer data to correct the gyroscope bias, and by incorporating this correction into the MSCKF framework, we improve the precision and stability of the system's initial pose estimation.
(3)
System design for indoor low-light environments. The L-MSCKF algorithm is proposed to address the challenge of insufficient lighting in indoor settings. By integrating the aforementioned technologies, it effectively enhances the pose estimation accuracy and robustness of the mobile robot binocular VIO system.

2. Related Work

2.1. Image Enhancement Algorithm

After years of development, the overall framework of visual SLAM [13] has matured, but most current visual SLAM algorithms assume good lighting conditions. Real environments often involve weak or uneven illumination, under which visual SLAM systems frequently fail [14]. To address this issue, image enhancement has gradually been applied to SLAM.
Image enhancement techniques fall into two primary categories: frequency-domain and spatial-domain methods [15]. Histogram equalization (HE) [16] and the Retinex method [17] are examples of spatial-domain procedures, whereas homomorphic filtering [18] is a frequency-domain technique. Histogram equalization is a non-linear method that stretches and redistributes image pixel values to achieve a more uniform distribution of intensities across a specified range of gray levels. Because it ignores the relationships between pixels, it can enhance contrast but may also lose detail [19]. Zuiderveld et al. proposed CLAHE [20] based on adaptive histogram equalization [21]. The algorithm clips the local histogram of each sub-block to avoid amplifying local noise, redistributes the clipped portion uniformly over the effective gray levels, performs histogram equalization on each sub-block, and finally uses bilinear interpolation to remove blocking artifacts [22]. The Retinex method assumes that the spatial illumination changes slowly, so halos tend to appear in regions where image brightness changes sharply [23]. The frequency-domain homomorphic filtering approach works better for images with pronounced lighting variations [24]. Homomorphic filtering combines grayscale transformation with frequency-domain filtering: based on the illumination–reflection structure of the image, it suppresses the low-frequency (illumination) component while boosting the high-frequency (reflectance) component. It can effectively address uneven lighting and excessive dynamic range, enhancing information in dark regions without sacrificing details in bright areas [25]. However, global homomorphic filtering has a drawback when brightening an image: it can over-enhance certain pixels while overlooking local contrast, which noticeably disturbs the image in low-light environments. The key to the enhancement process is to retain image detail while removing as much noise as possible. Taking this into account, to enhance visual quality, this paper fuses the frequency-domain and spatial-domain ideas, combining homomorphic filtering with CLAHE to meet the application requirements under weak illumination.
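As a concrete illustration of the difference between global equalization and CLAHE, the sketch below applies both to a grayscale image with OpenCV; the clip limit of 4 matches the value chosen later in this paper, while the 8 × 8 tile grid is an assumed, typical setting.

```cpp
// Sketch: global histogram equalization vs. CLAHE on a grayscale image.
// Assumes OpenCV; the tile grid size is an illustrative choice.
#include <opencv2/opencv.hpp>

void equalizeVariants(const cv::Mat& gray, cv::Mat& he_out, cv::Mat& clahe_out) {
    // Global HE: redistributes the whole histogram; may amplify noise and lose detail.
    cv::equalizeHist(gray, he_out);

    // CLAHE: per-tile equalization with a clipped histogram, blended by
    // bilinear interpolation to avoid blocking artifacts.
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(/*clipLimit=*/4.0, /*tileGridSize=*/cv::Size(8, 8));
    clahe->apply(gray, clahe_out);
}
```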

2.2. IMU Bias Correction Algorithm

In complex and variable environments, incorporating additional sensors such as the IMU and wheel odometers supplies extra observations for SLAM pose estimation, thereby enhancing the precision and robustness of the algorithm. VIO [26] is widely used to solve the localization problem of robots in unknown environments [27] owing to its high precision, real-time performance, reliance only on onboard sensors, and lack of any need to interact with the environment. Using the IMU pre-integration approach [28] to integrate the IMU measurements between two adjacent frames and jointly optimize the IMU pose, Qin et al. [29] created a publicly available monocular visual inertial SLAM system (VINS-mono). For filtering-based VIO, Mourikis et al. [30] introduced the multi-state constraint Kalman filter, which decreases the computational cost and improves resilience by eliminating feature positions from the system state through a left null space projection. Accordingly, several derivative works have sought to increase the accuracy of the MSCKF [11]. However, the existence of IMU bias [12] affects the pose estimation accuracy of the system. Conventional methods for correcting the IMU bias coefficients online [31] include the Zero Velocity Update (ZUPT) [32] and Non-Holonomic Constraints (NHCs) [33]. However, the ZUPT model can only be applied when the IMU is static, and its key difficulty is accurately judging whether the carrier is stationary; in practice, it is challenging to assess the measured acceleration accurately and decide whether the gyroscope reading exceeds the threshold. The NHC model relies on the robot kinematics model; it is advantageous only for ground mobile robots and is unsuitable for six-degree-of-freedom platforms such as drones. The accelerometer and gyroscope are complementary sensors used in inertial navigation systems and UAV attitude calculation [34]. In general, the accelerometer has strong static stability, while the gyroscope is better suited to capturing high-frequency motion. Therefore, when the IMU moves smoothly, the accelerometer measurements can be used to rectify the gyroscope's bias.

3. Materials and Methods

Figure 1 depicts the general workflow of the L-MSCKF algorithm. The low-light adaptation mechanisms of L-MSCKF are highlighted in blue in the flow chart, while black denotes the components of the traditional MSCKF algorithm. The inertial measurements and images are the system inputs. Upon receiving a new measurement, the system first checks whether initialization is complete. If not, the initialization procedure is executed; if so, the gyroscope bias correction is performed and the binocular images are enhanced. The system then proceeds to visual feature extraction and tracking, augmentation of the camera state and covariance, and propagation of the IMU state and covariance. Next, the MSCKF measurement update is executed, followed by sliding-window state management. Finally, the pose estimates of the robot are output in real time to accomplish robot positioning.
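The sketch below outlines this per-measurement flow as a hypothetical C++ skeleton; the class and method names are invented for illustration and do not correspond to the authors' implementation.

```cpp
// Hypothetical skeleton of the Figure 1 workflow (names are illustrative only).
struct Measurement { /* stereo image pair plus the IMU readings since the last frame */ };

class LMsckf {
public:
    void process(const Measurement& m) {
        if (!initialized_) { initialize(m); return; }
        correctGyroBias(m);     // complementary filter, Section 3.2
        enhanceImages(m);       // homomorphic filtering + CLAHE, Section 3.1
        trackFeatures(m);       // front-end feature extraction and tracking
        propagateImuState(m);   // IMU state and covariance propagation
        augmentCameraState();   // add the current camera pose to the sliding window
        msckfUpdate();          // multi-state constraint measurement update
        pruneSlidingWindow();   // sliding-window state management
        publishPose();          // output the pose estimate in real time
    }
private:
    bool initialized_ = false;
    void initialize(const Measurement&) { initialized_ = true; }
    void correctGyroBias(const Measurement&) {}
    void enhanceImages(const Measurement&) {}
    void trackFeatures(const Measurement&) {}
    void propagateImuState(const Measurement&) {}
    void augmentCameraState() {}
    void msckfUpdate() {}
    void pruneSlidingWindow() {}
    void publishPose() {}
};
```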

3.1. Image Processing

Homomorphic filtering is a frequency-domain image enhancement technique. By increasing contrast and compressing the brightness range through the illumination–reflection model, it improves image quality. The method represents the image $f(x, y)$ as the product of the illumination intensity $i(x, y)$ and the reflection intensity $r(x, y)$:

$$f(x, y) = i(x, y) \times r(x, y), \tag{1}$$
where $f(x, y)$ denotes the resulting image, $i(x, y)$ denotes the illumination component, which varies slowly and corresponds to the low-frequency content, and $r(x, y)$ denotes the reflection component, a property of the objects in the scene that varies rapidly and corresponds to the high-frequency content. Because the two components are combined multiplicatively, they cannot be separated directly in the frequency domain: linear frequency-domain filtering handles additive combinations of signals, not multiplicative ones. A non-linear logarithmic transform is therefore applied to convert the product into a sum:
$$\ln f(x, y) = \ln i(x, y) + \ln r(x, y). \tag{2}$$
The discrete cosine transform is applied to both sides of (2) to obtain the frequency-domain expression. Traditional homomorphic filtering is based on the Fourier transform and its inverse; however, the Fourier transform involves complex-valued coefficients, which complicates the computation, increases the amount of data to handle, and gives poor real-time performance unsuitable for online processing. Therefore, this study uses the discrete cosine transform (DCT), a special form of the Discrete Fourier Transform (DFT) [35] in which the transformed signal is a real, even function. The DCT coefficients are all real numbers, and the forward and inverse DCT share the same transform kernel. The calculation is therefore simple, which significantly reduces the computational load, improves the operational speed, and meets the storage requirements of the SLAM system:
$$F(u, v) = I(u, v) + R(u, v), \tag{3}$$
where (3) expresses the log-image in the frequency domain: the high-frequency part represents the texture and details of the image, while the low-frequency part represents its overall contours. The transfer function $H(u, v)$ is applied to the output of (3):
$$H(u, v) F(u, v) = H(u, v) I(u, v) + H(u, v) R(u, v), \tag{4}$$
where $H(u, v)$ enhances image detail by strengthening the reflection component and attenuating the illumination component in the frequency domain. In this work, a Gaussian high-pass filter is applied to the image after the DCT. To return the image from the frequency domain to the spatial domain, the inverse DCT is applied to (4):
$$h_f(x, y) = h_i(x, y) + h_r(x, y). \tag{5}$$
The result is then converted back into an image by an exponential transformation, restoring the gray-level range of the image:
$$g(x, y) = \exp[h_f(x, y)] = \exp[h_i(x, y)] \cdot \exp[h_r(x, y)]. \tag{6}$$
Finally, the image after homomorphic filtering is obtained and processed using the CLAHE algorithm. The principle of CLAHE is already well documented in the existing literature [36], and thus, it will not be elaborated upon here.
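A minimal sketch of this combined enhancement step is given below, assuming OpenCV and an 8-bit grayscale input; the Gaussian high-pass transfer function and its cutoff D0 are illustrative choices rather than the authors' exact implementation, and the default parameters anticipate the values selected in Section 4.1.

```cpp
// Sketch: DCT-domain homomorphic filtering followed by CLAHE (Equations (1)-(6)).
// Assumes OpenCV; the filter shape and cutoff D0 are illustrative assumptions.
#include <opencv2/opencv.hpp>
#include <cmath>

cv::Mat enhanceLowLight(const cv::Mat& gray,
                        double rH = 1.6, double rL = 0.3, double c = 1.5,
                        double clipLimit = 4.0, double D0 = 30.0) {
    CV_Assert(gray.type() == CV_8UC1);
    // cv::dct needs even dimensions; crop a trailing row/column if necessary.
    cv::Mat img = gray(cv::Rect(0, 0, gray.cols & ~1, gray.rows & ~1)).clone();

    // (2): logarithm turns the multiplicative illumination-reflectance model into a sum.
    cv::Mat logImg;
    img.convertTo(logImg, CV_32F, 1.0 / 255.0);
    logImg += cv::Scalar::all(1e-3);  // avoid log(0)
    cv::log(logImg, logImg);

    // (3): discrete cosine transform to the frequency domain.
    cv::Mat freq;
    cv::dct(logImg, freq);

    // (4): Gaussian high-pass emphasis H(u,v) attenuates the low-frequency
    // illumination term (gain rL) and boosts the high-frequency reflectance term (gain rH).
    cv::Mat H(freq.size(), CV_32F);
    for (int u = 0; u < H.rows; ++u)
        for (int v = 0; v < H.cols; ++v) {
            double D2 = double(u) * u + double(v) * v;
            H.at<float>(u, v) = static_cast<float>(
                (rH - rL) * (1.0 - std::exp(-c * D2 / (D0 * D0))) + rL);
        }
    freq = freq.mul(H);

    // (5)-(6): inverse DCT and exponential restore the spatial-domain gray levels.
    cv::Mat spatial, spatial8u;
    cv::idct(freq, spatial);
    cv::exp(spatial, spatial);
    cv::normalize(spatial, spatial, 0, 255, cv::NORM_MINMAX);
    spatial.convertTo(spatial8u, CV_8UC1);

    // CLAHE restores local contrast without amplifying noise.
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(clipLimit, cv::Size(8, 8));
    cv::Mat out;
    clahe->apply(spatial8u, out);
    return out;
}
```

With the parameters chosen in Section 4.1, a call would simply be `enhanceLowLight(gray)`, since the defaults above already encode them.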

3.2. IMU Bias Correction Model

Due to the inherent bias present in IMU devices, if not corrected promptly, the error will accumulate over time, ultimately leading to inaccurate or even divergent positioning outcomes. Consequently, it is essential to correct this cumulative error. The IMU exhibits varying dynamic response capabilities across different frequency bands. Leveraging the frequency complementarity of gyroscopes and accelerometers, we can integrate their performance strengths using a complementary filter algorithm. This approach combines the output information from different frequency ranges, yielding characterization results that more closely align with the actual values.
The state variable $x$ of the online gyroscope bias correction model consists of the JPL quaternion [37], a quaternion representation that uses a left-handed convention with the product of the imaginary units satisfying $ijk = 1$, and the gyroscope bias $w_b$. The quaternion representing rotation is a unit quaternion:
$$|q| = \sqrt{q_1^2 + q_2^2 + q_3^2 + q_4^2} = 1, \tag{7}$$
$$x = \begin{bmatrix} {}^{I}_{G}q^{T} & w_b^{T} \end{bmatrix}^{T}. \tag{8}$$
Quaternions are chosen to represent the attitude because they describe angular position more flexibly than Euler angles and do not suffer from singularities. The quaternion consists of a scalar $q_4$ and a vector $[q_1, q_2, q_3]^T$. In (8), ${}^{I}_{G}q$ denotes the rotation quaternion from the world coordinate system $G$ to the inertial navigation coordinate system $I$. The continuous-time differential equation of the state is
$${}^{I}_{G}\dot{q} = \frac{1}{2}\,\Omega(w - w_b)\, {}^{I}_{G}q, \tag{9}$$
where $w$ represents the angular velocity measured by the IMU. The operator $\Omega(\cdot)$ converts a vector into an anti-symmetric matrix:
$$\Omega(w) = \begin{bmatrix} -\lfloor w \rfloor_{\times} & w \\ -w^{T} & 0 \end{bmatrix}, \qquad \lfloor w \rfloor_{\times} = \begin{bmatrix} 0 & -w_z & w_y \\ w_z & 0 & -w_x \\ -w_y & w_x & 0 \end{bmatrix}. \tag{10}$$
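As a small illustration, the $\Omega(\cdot)$ operator of (10) can be assembled with Eigen as in the sketch below, following the sign convention written above.

```cpp
// Sketch: the skew-symmetric operator and Omega(w) of Equation (10), using Eigen.
#include <Eigen/Dense>

Eigen::Matrix3d skew(const Eigen::Vector3d& w) {
    Eigen::Matrix3d S;
    S <<    0.0, -w.z(),  w.y(),
         w.z(),    0.0, -w.x(),
        -w.y(),  w.x(),    0.0;
    return S;
}

Eigen::Matrix4d Omega(const Eigen::Vector3d& w) {
    Eigen::Matrix4d O;
    O.topLeftCorner<3, 3>()    = -skew(w);
    O.topRightCorner<3, 1>()   = w;
    O.bottomLeftCorner<1, 3>() = -w.transpose();
    O(3, 3) = 0.0;
    return O;
}
```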
The residual $e$ is the discrepancy between the accelerometer measurement $z_k$ and the predicted gravity $\hat{g}$, where $\hat{g}$ is obtained by rotating the gravity vector $g$, known in the world frame, into the IMU coordinate frame:
$$e = z_k - \hat{g} = z_k - C({}^{I}_{G}q)\,g, \qquad C({}^{I}_{G}q) = \begin{bmatrix} 2q_4^2 - 1 + 2q_1^2 & 2q_4 q_3 + 2q_1 q_2 & 2q_1 q_3 - 2q_4 q_2 \\ 2q_2 q_1 - 2q_4 q_3 & 2q_4^2 - 1 + 2q_2^2 & 2q_4 q_1 + 2q_2 q_3 \\ 2q_4 q_2 + 2q_3 q_1 & 2q_3 q_2 - 2q_4 q_1 & 2q_4^2 - 1 + 2q_3^2 \end{bmatrix}. \tag{11}$$
The matrix $C$ in (11) is the rotation matrix corresponding to the quaternion ${}^{I}_{G}q$. The Jacobian matrix $H_k^{bg}$ of the measurement equation and the Jacobian matrix $G_k^{bg}$ of the state transition equation are given by
$$G_k^{bg} = \begin{bmatrix} I_{4\times4} + \frac{1}{2}\Delta t\, \Omega(w - w_b) & -\frac{1}{2}\Delta t\, L(q) \\ 0_{3\times4} & I_{3\times3} \end{bmatrix}, \qquad H_k^{bg} = \begin{bmatrix} 2q_3 & -2q_4 & 2q_1 & -2q_2 & 0 & 0 & 0 \\ 2q_4 & 2q_3 & 2q_2 & 2q_1 & 0 & 0 & 0 \\ -2q_1 & -2q_2 & 2q_3 & 2q_4 & 0 & 0 & 0 \end{bmatrix}, \tag{12}$$
where $\Delta t$ denotes the time interval between consecutive executions of the algorithm. The matrix $L(q)$ in (12) is given by

$$L(q) = \begin{bmatrix} q_4 & -q_3 & q_2 \\ q_3 & q_4 & -q_1 \\ -q_2 & q_1 & q_4 \\ -q_1 & -q_2 & -q_3 \end{bmatrix}. \tag{13}$$
Based on this, the update formula of the bias model is as follows:
$$\begin{aligned} K_k^{bg} &= P_{k|k-1}^{bg} (H_k^{bg})^{T} \left[ H_k^{bg} P_{k|k-1}^{bg} (H_k^{bg})^{T} + R_k^{bg} \right]^{-1}, \\ x_{k|k} &= x_{k|k-1} + K_k^{bg}\, e, \\ P_k^{bg} &= \left( I - K_k^{bg} H_k^{bg} \right) P_{k|k-1}^{bg}, \end{aligned} \tag{14}$$
where $R_k^{bg}$ denotes the measurement noise covariance and $P_k^{bg}$ the estimated covariance matrix at time $k$. Gravity vector measurements can only correct the roll and pitch angles; to ensure that the yaw angle is not affected by the correction, $q_3$ is set to zero after the state is updated. Finally, the corrected bias $w_b$ is obtained and used in the subsequent MSCKF computation.
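A minimal Eigen sketch of the measurement update (14) is shown below. The 7-dimensional state layout, the use of a normalized gravity vector, the noise value, and the placement of the yaw-protection step are assumptions made for illustration; they are not the authors' exact implementation.

```cpp
// Sketch of the complementary-filter update (Equations (11)-(14)) with Eigen.
// z_k is assumed to be the accelerometer reading normalized to unit length.
#include <Eigen/Dense>

struct BiasFilterState {
    Eigen::Vector4d q;              // JPL quaternion [q1 q2 q3 q4], q4 scalar
    Eigen::Vector3d wb;             // gyroscope bias
    Eigen::Matrix<double, 7, 7> P;  // state covariance
};

// Rotation matrix C(q) of Equation (11).
Eigen::Matrix3d rotationC(const Eigen::Vector4d& q) {
    const double q1 = q(0), q2 = q(1), q3 = q(2), q4 = q(3);
    Eigen::Matrix3d C;
    C << 2*q4*q4 - 1 + 2*q1*q1, 2*q4*q3 + 2*q1*q2,     2*q1*q3 - 2*q4*q2,
         2*q2*q1 - 2*q4*q3,     2*q4*q4 - 1 + 2*q2*q2, 2*q4*q1 + 2*q2*q3,
         2*q4*q2 + 2*q3*q1,     2*q3*q2 - 2*q4*q1,     2*q4*q4 - 1 + 2*q3*q3;
    return C;
}

void updateBias(BiasFilterState& s, const Eigen::Vector3d& z_k, double accel_noise = 1e-2) {
    const Eigen::Vector3d g(0.0, 0.0, 1.0);               // unit gravity in the world frame
    const Eigen::Vector3d e = z_k - rotationC(s.q) * g;   // residual, Equation (11)

    // Measurement Jacobian H_k^{bg} of Equation (12); the bias columns are zero.
    const double q1 = s.q(0), q2 = s.q(1), q3 = s.q(2), q4 = s.q(3);
    Eigen::Matrix<double, 3, 7> H = Eigen::Matrix<double, 3, 7>::Zero();
    H.block<3, 4>(0, 0) <<  2*q3, -2*q4,  2*q1, -2*q2,
                            2*q4,  2*q3,  2*q2,  2*q1,
                           -2*q1, -2*q2,  2*q3,  2*q4;

    const Eigen::Matrix3d R_k = accel_noise * Eigen::Matrix3d::Identity();
    const Eigen::Matrix<double, 7, 3> K =
        s.P * H.transpose() * (H * s.P * H.transpose() + R_k).inverse();

    Eigen::Matrix<double, 7, 1> dx = K * e;   // x_{k|k} = x_{k|k-1} + K e
    dx(2) = 0.0;                              // yaw protection: zero the q3 correction
    s.q  += dx.head<4>();
    s.wb += dx.tail<3>();
    s.q.normalize();                          // re-impose the unit-norm constraint (7)
    s.P = (Eigen::Matrix<double, 7, 7>::Identity() - K * H) * s.P;
}
```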

4. Results

4.1. Analysis of Image Enhancement Effect

Since both the homomorphic filtering and CLAHE image enhancement algorithms possess critical adjustable parameters that significantly influence the results, the optimal parameter combination must first be determined. Experiments are carried out on the public EuRoC dataset [38] to confirm the efficacy of the method described in this study. The data were collected by an AscTec Firefly six-rotor UAV (Intel, Witten, Germany) equipped with an MT9V034 camera (ON Semiconductor, Scottsdale, AZ, USA) with a 1/3-inch sensor, 6.0 × 6.0 micrometer pixels, and a global shutter. The camera provides 752 × 480 pixel images at a sample rate of 20 Hz, and the camera and IMU are hardware-synchronized. The dataset contains the camera's binocular images and the IMU's inertial measurements, together with the sensor mounting parameters and the pose information obtained by external devices. The ground-truth trajectory of the UAV is provided by VICON0 (Oxford Metrics, Oxford, UK) reflective markers and a LEICA0 laser tracking sensor (Leica Geosystems, Heerbrugg, Switzerland) that outputs at a frame rate of 20 Hz. VICON is based on the Vicon motion capture system: a set of reflective markers on the UAV allows position and attitude to be obtained with millimeter-level accuracy at a frame rate of 100 Hz. The dataset comprises 11 sequences recorded in two scenes, a machine hall and an ordinary room. According to the flight status and environmental conditions of the UAV, each sequence is assigned one of three difficulty levels: easy, medium, and difficult. As the difficulty increases, the UAV flies faster and the ambient light changes more drastically.
The experimental hardware platform utilizes an 11th Gen Intel(R) Core(TM) i9-11900K CPU with a main frequency of 3.50 GHz, an NVIDIA RTX A4000 graphics card, and 64.0 GB of memory; the algorithms proposed herein are implemented in C++ and operate on Ubuntu 20.04 with the ROS Noetic version. Using the MH02 dataset as an example, a controlled variable method is employed to study the impact of each parameter on the enhancement of binocular images, thereby identifying the optimal settings. Figure 2 shows the outcomes of the experiments.
In Figure 2, $r_H$ is the high-frequency gain; the greater its value, the brighter the image. $r_L$ is the low-frequency gain, which controls the low-frequency components in the image. Generally, $r_H > 1$ and $r_L < 1$. The sharpening coefficient $c$, which controls the steepness of the filter function's slope, is a constant between the low-frequency and high-frequency gains. The contrast enhancement threshold (clip limit) of CLAHE depends on the normalization of the histogram and the size of the neighborhood, and is generally set to a positive value. First, the high-frequency gain $r_H$ is studied with the low-frequency gain $r_L$ fixed at 0.5, the sharpening coefficient $c$ set to 1, and the CLAHE clip limit set to 4; the results are shown in Figure 2a. Similarly, while keeping the other parameters constant, the low-frequency gain, sharpening coefficient, and contrast threshold are studied individually, with the results presented in Figure 2b–d. The figure indicates that when the high-frequency gain is between 1.4 and 1.8, the low-frequency gain between 0.2 and 0.8, the sharpening coefficient between 1.5 and 1.8, and the contrast threshold between 3 and 5, the algorithm's root mean square error is relatively low. Consequently, the high-frequency gain is set to 1.6, the low-frequency gain to 0.3, the sharpening coefficient to 1.5, and the CLAHE clip limit to 4. Based on these parameter settings, the image enhancement effect is analyzed, with the results depicted in Figure 3.
Figure 3 shows, in an intuitive way, the number of feature points extracted from each image after enhancement. Two groups of images, dark scenes and bright scenes, are selected for comparison. In the figure, the first column shows the number of feature points obtained from the original image; the second and third columns show the extraction results after only CLAHE processing and only homomorphic filtering, respectively; and the last column shows the result after combined CLAHE and homomorphic filtering. It is evident that very few feature points are retained in the original images, particularly in the dimly lit scenes. Even in low light, considerably more feature points are obtained after applying the combined processing approach used in this work, demonstrating the effectiveness of the image enhancement technique.
Table 1 quantifies the FAST feature points extracted in each scene of Figure 3a. The table shows that the original images in the normal-light scenes yield at least ten times as many feature points as those in the weak-light scenes, where only tens of points are extracted. The number of extracted feature points rises with CLAHE processing alone and generally falls with homomorphic filtering alone. Because homomorphic filtering operates in the frequency domain of the image, some high-frequency details may be suppressed or lost. FAST is a corner-based feature extractor that focuses on corner features in the image; corners are normally regions with pronounced texture changes, and homomorphic filtering may blur or weaken these changes, preventing the FAST algorithm from detecting enough corners. Consequently, fewer feature points are extracted after homomorphic filtering alone. The number of feature points extracted after the combined processing is larger than with either CLAHE or homomorphic filtering alone. In scene 1, with relatively normal light, the combined processing yields 4.9 times as many feature points as the original image, while in the darker scene 4 it yields 5.5 times as many.
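The counts in Table 1 can, in principle, be reproduced by detecting FAST corners on each image variant and counting them, as in the sketch below; OpenCV is assumed and the detection threshold is an arbitrary illustrative value.

```cpp
// Sketch: count FAST corners in a grayscale image (threshold is illustrative).
#include <opencv2/opencv.hpp>
#include <vector>

int countFastCorners(const cv::Mat& gray, int threshold = 20) {
    std::vector<cv::KeyPoint> keypoints;
    cv::FAST(gray, keypoints, threshold, /*nonmaxSuppression=*/true);
    return static_cast<int>(keypoints.size());
}
```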
To further confirm the efficacy of the fusion technique for image enhancement, evaluation measures such as the Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM) are used to quantify the enhancement. MSE is a pixel-statistics metric that assesses the quality of the distorted image by the mean squared pixel difference between the reference and distorted images. PSNR characterizes the fidelity of the image: the smaller the distortion, the higher the value. SSIM focuses on the structural differences between the reference and distorted images; a larger SSIM indicates that the distorted image is structurally closer to the reference, suggesting better quality [39]. In the experiment, the standard image quality assessment dataset KADID-10k [40] is used, with the I09 and I61 images selected as references. The darker images I09_17_05 and I61_17_05 are chosen as the originals and subjected to image enhancement. Table 2 reports the three image quality metrics.
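For reference, MSE and PSNR for 8-bit grayscale images can be computed as in the sketch below (SSIM, which involves local window statistics, is omitted for brevity); this is a generic illustration, not the evaluation code behind Table 2.

```cpp
// Sketch: full-reference MSE and PSNR for 8-bit single-channel images.
#include <opencv2/opencv.hpp>
#include <cmath>

double meanSquaredError(const cv::Mat& ref, const cv::Mat& img) {
    CV_Assert(ref.size() == img.size() && ref.type() == CV_8UC1 && img.type() == CV_8UC1);
    cv::Mat diff, diff64;
    cv::absdiff(ref, img, diff);
    diff.convertTo(diff64, CV_64F);
    diff64 = diff64.mul(diff64);
    return cv::sum(diff64)[0] / static_cast<double>(diff64.total());
}

double peakSignalToNoiseRatio(const cv::Mat& ref, const cv::Mat& img) {
    const double mse = meanSquaredError(ref, img);
    return 10.0 * std::log10(255.0 * 255.0 / mse);  // in dB; larger means less distortion
}
```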
Because a full reference dataset is available, full-reference quality metrics such as PSNR and SSIM can be used to assess the images before and after enhancement. The results in Table 2 indicate that the combined algorithm performs best on all evaluation metrics. Regarding the image error metric MSE in particular, the combined algorithm shows a significant improvement over homomorphic filtering and CLAHE: the MSE of the enhanced images is reduced by an average of 48.685% compared with the original images. For PSNR, the combined algorithm also gives the best results, indicating that it maintains high pixel-level accuracy in the enhanced images. In addition, the SSIM results further confirm the performance of the combined algorithm in preserving the structural information of the images.

4.2. IMU Bias Correction Result

Figure 4 displays the gyroscope bias coefficients estimated on the MH02 and V203 sequences, comparing L-MSCKF (which applies the IMU bias correction) with MSCKF-VIO. The bias estimates for the three coordinate axes are shown, with L-MSCKF in red and MSCKF-VIO in blue. It is evident from the figure that the traditional MSCKF-VIO algorithm exhibits significant fluctuations in the estimated gyroscope biases on these sequences. Such fluctuations not only affect the stability of pose estimation but can also degrade the accuracy of the overall VIO pose estimate. The L-MSCKF algorithm effectively mitigates these disturbances by incorporating acceleration measurements to update the gyroscope bias coefficients, thus enhancing the precision of VIO pose estimation.

4.3. Comprehensive Evaluation of Algorithm Performance

4.3.1. Verification of Pose Estimation Results

For a better comparative analysis, this section conducts experiments with different visual inertial odometry methods, including the optimization-based VINS-mono, the filter-based MSCKF-VIO, and L-MSCKF. The operation of each algorithm on V203 can be seen at https://github.com/quickstates/l-msckf (accessed on 15 October 2024).
Figure 5 compares the trajectories of the improved algorithm with the original MSCKF-VIO and VINS-mono algorithms on the difficult sequences V103 and V203. The black dotted line shows the actual drone flight path provided with each data packet; the red, blue, and green solid lines indicate the trajectories obtained by L-MSCKF, MSCKF-VIO, and VINS-mono, respectively. The trajectories of the three algorithms on V103 and V203 are displayed in Figure 5a,c. To present the trajectory deviations more intuitively, the per-axis comparisons are shown in Figure 5b,d. In the V103 sequence, the trajectory of the L-MSCKF algorithm closely follows the true trajectory, demonstrating good consistency; in particular, at the curved sections of the trajectory, where VINS-mono and MSCKF-VIO exhibit some deviation, L-MSCKF stays close to the true values. In the V203 sequence, L-MSCKF shows an even more significant accuracy advantage over MSCKF-VIO and VINS-mono: even where MSCKF-VIO deviates substantially from the ground truth, the L-MSCKF trajectory maintains a small error. Figure 5b,d show that L-MSCKF is relatively stable on all three axes, deviating only minimally from the true trajectory, whereas MSCKF-VIO and VINS-mono exhibit larger deviations, particularly at the beginning and end of the path and on the Y and Z axes.
To statistically evaluate the consistency of the trajectories obtained by the different algorithms, the Absolute Pose Error (APE) is introduced. APE is one of the key metrics for assessing SLAM positioning accuracy: it describes the discrepancy between the true camera pose and the estimated pose and encompasses both rotational and translational errors. In this study, we use the root mean square error (RMSE), mean error, median error, and standard deviation (STD) to characterize the error. The statistical outcomes are presented in Table 3 and Table 4. Furthermore, to evaluate the algorithms' performance under low-light conditions, the V203 sequence, which contains the largest number of low-light scenes, is selected for comparison; the errors of each algorithm on this sequence are compared in Figure 6.
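As an illustration of the translational part of this metric, the sketch below computes the RMSE between time-associated estimated and ground-truth positions after a rigid alignment with Eigen's Umeyama fit; it is a generic example, not the evaluation script used for Table 3 and Table 4.

```cpp
// Sketch: absolute trajectory error (translation RMSE) after SE(3) alignment.
#include <Eigen/Dense>
#include <cassert>
#include <cmath>
#include <vector>

double translationRMSE(const std::vector<Eigen::Vector3d>& est,
                       const std::vector<Eigen::Vector3d>& gt) {
    const int n = static_cast<int>(est.size());
    assert(n > 0 && est.size() == gt.size());

    // Stack the time-associated points column-wise.
    Eigen::Matrix3Xd E(3, n), G(3, n);
    for (int i = 0; i < n; ++i) { E.col(i) = est[i]; G.col(i) = gt[i]; }

    // Rigid alignment (no scale) of the estimate onto the ground truth.
    const Eigen::Matrix4d T = Eigen::umeyama(E, G, /*with_scaling=*/false);

    double sq_sum = 0.0;
    for (int i = 0; i < n; ++i) {
        const Eigen::Vector3d aligned = T.block<3, 3>(0, 0) * E.col(i) + T.block<3, 1>(0, 3);
        sq_sum += (aligned - G.col(i)).squaredNorm();
    }
    return std::sqrt(sq_sum / n);
}
```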
Table 3 and Table 4 respectively present the absolute translation error and rotation error of the algorithms on the same dataset. Data from the V1, V2, and MH sequences are selected to represent three levels of difficulty: the easy V101 and V201 sequences, the medium-difficulty V102, V202, and MH03 sequences, and the difficult V103, V203, and MH04 sequences. As shown in Table 3, the L-MSCKF method achieves the best results, with an average RMSE of 0.152 m, lower than both VINS-mono (average RMSE 0.208 m) and MSCKF-VIO (average RMSE 0.349 m). Compared with the original MSCKF-VIO algorithm, the RMSE is reduced by up to 85.9% (on the V203 sequence). This highlights the advantage of L-MSCKF under low-light conditions, particularly in the MH03 and MH04 sequences, which feature numerous low-light scenes and complex rotations. By enhancing the original images and correcting the biases, L-MSCKF greatly improves the feature point extraction and matching of the original algorithm and thereby the accuracy of trajectory positioning; compared with MSCKF-VIO, the RMSE on these two sequences is reduced by 24.9% and 20.4%, respectively.
Table 4, which reports the rotational error, shows that the L-MSCKF algorithm is marginally more precise than MSCKF-VIO, with the advantage being particularly pronounced relative to VINS-mono. Analyzing Table 3 and Table 4 together, we conclude that the primary advantage of L-MSCKF lies in the translational accuracy of the trajectory. Figure 6 further illustrates that the trajectory errors of MSCKF-VIO and VINS-mono begin to increase significantly in the timeframes with a higher prevalence of dark scenes, whereas L-MSCKF consistently maintains a lower error range. L-MSCKF benefits from its image enhancement and gyroscope bias correction, which provide more accurate data at the initial stage; its error during the initial time period is therefore significantly lower than that of MSCKF-VIO. Subsequently, under poor illumination, the image enhancement of L-MSCKF ensures that a satisfactory number of feature points can still be extracted from the camera data, thereby maintaining positioning accuracy.

4.3.2. Algorithm Efficiency Analysis

Figure 7 compares the computing resources and run time of the L-MSCKF, MSCKF-VIO, and VINS-mono algorithms on the dataset of Table 3. Figure 7a shows the average CPU utilization of each algorithm as a percentage of the total available CPU, while Figure 7b shows their total running times. In the figures, green represents VINS-mono, blue MSCKF-VIO, and red L-MSCKF. The MSCKF-VIO algorithm, being relatively simple, requires the least CPU resources and time. The improved L-MSCKF algorithm, which adds enhancement and correction processing for both camera and IMU data, incurs higher CPU resource demands and processing time, particularly in CPU consumption; nevertheless, it still uses fewer resources than the VINS-mono algorithm, which relies on non-linear optimization. Both L-MSCKF and MSCKF-VIO are built on the Extended Kalman Filter structure. During computation, they limit the number of historical camera states involved in constructing the residuals by constraining the sliding-window length, and techniques such as QR decomposition are employed to further reduce the dimensionality of the observation matrix.

5. Discussion

As shown in Figure 6, when the illumination weakens, the errors of MSCKF-VIO and VINS-mono increase suddenly, while L-MSCKF effectively suppresses and mitigates this error. L-MSCKF supports image matching and pose estimation by optimizing the front-end raw data to provide accurate initial values, as depicted in Figure 3 and Figure 5. Furthermore, the improvements of L-MSCKF are concentrated primarily in the translational component, as shown in Table 3 and Table 4. Comparing the errors of L-MSCKF with those of MSCKF-VIO and VINS-mono on the same dataset shows that L-MSCKF can effectively mitigate the effects of low illumination, thereby achieving high-precision positioning for the robot.
In this study, we introduce a localization technique for robots that enhances the quality of insufficient raw data through multi-sensor fusion, enabling adaptation to low-light environments. This method demonstrates excellent performance under low-light conditions, enhancing the overall stability of the robot’s positioning. However, it still encounters challenges in indoor environments, particularly in scenes characterized by uniform weak texture features. Therefore, future work will require the integration of additional sensors to address these issues, thereby further enhancing the positioning accuracy and broadening the practicality of the approach.

6. Conclusions

Addressing the challenge of localization in dimly lit indoor environments, this paper presents an L-MSCKF algorithm that incorporates data correction. The main content includes the following:
(1)
An image enhancement module is developed to lessen the negative impact that poor-quality images have on the accuracy of stereo visual odometry. Validated on public datasets, this module yields a notable increase in the number of feature points, improved image quality, and improved feature matching accuracy.
(2)
To address the issue of IMU bias instability, a complementary Kalman filter is introduced. This filter uses the acceleration data of the IMU to compensate for the gyroscope bias. Experimental findings show that this method effectively reduces the IMU bias and improves the stability of the bias coefficients, thereby increasing the accuracy of the localization algorithm.
(3)
The L-MSCKF algorithm is tested on the EuRoC dataset, outperforming two other methods. Test outcomes indicate that the L-MSCKF algorithm achieves an average RMSE of 0.152 m, outperforming the original MSCKF-VIO algorithm and exhibiting enhanced accuracy.
In general, this study analyzes the positioning problem of visual inertial odometry in indoor environments with insufficient lighting. In the future, we intend to integrate advanced sensor technologies such as Ultra-Wideband (UWB) and LiDAR into robotic systems to address complex illumination changes and weak texture features. By leveraging accurate distance measurement and efficient feature capture, we aim to enhance real-time localization capabilities in challenging environments such as underground areas and tunnels.

Author Contributions

Conceptualization, M.W. and Z.L.; methodology, M.W.; software, Y.T.; validation, M.W., Y.T. and P.W.; formal analysis, M.W.; investigation, Y.T., P.W. and M.W.; resources, Z.L.; data curation, Z.Y.; writing—original draft preparation, M.W.; writing—review and editing, M.A.N.-A.; visualization, M.W.; supervision, Z.L. and M.A.N.-A.; project administration, L.G.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fundamental Research Funds for the Universities of Henan Province (Grant number NSFRF230405), the Doctoral Scientific Fund Project of Henan Polytechnic University (Grant number B2017-10), Henan Polytechnic University Funding Plan for Young Backbone Teachers (Grant number 2022XQG-08), the Henan Province Science and Technology Research Projects (Grant number 242102320070), the National Natural Science Foundation of China (Grant number 42374029).

Data Availability Statement

The data presented in this study are available on request at https://projects.asl.ethz.ch/datasets/doku.php?id=kmavvisualinertialdatasets (accessed on 1 September 2016).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, X.; Song, B.; Shen, Z.; Zhou, Y.; Lyu, H.; Qin, Z. Consistent localization for autonomous robots with inter-vehicle GNSS information fusion. IEEE Commun. Lett. 2022, 27, 120–124. [Google Scholar] [CrossRef]
  2. Benachenhou, K.; Bencheikh, M.L. Detection of global positioning system spoofing using fusion of signal quality monitoring metrics. Comput. Electr. Eng. 2021, 92, 107159. [Google Scholar] [CrossRef]
  3. Tian, Y.; Lian, Z.; Núñez-Andrés, M.A.; Yue, Z.; Li, K.; Wang, P.; Wang, M. The application of gated recurrent unit algorithm with fused attention mechanism in UWB indoor localization. Measurement 2024, 234, 114835. [Google Scholar] [CrossRef]
  4. Gao, X.; Lin, X.; Lin, F.; Huang, H. Segmentation Point Simultaneous Localization and Mapping: A Stereo Vision Simultaneous Localization and Mapping Method for Unmanned Surface Vehicles in Nearshore Environments. Electronics 2024, 13, 3106. [Google Scholar] [CrossRef]
  5. Sun, T.; Liu, Y.; Wang, Y.; Xiao, Z. An improved monocular visual-inertial navigation system. IEEE Sens. J. 2020, 21, 11728–11739. [Google Scholar] [CrossRef]
  6. Zhang, J.; Xu, L.; Bao, C. An Adaptive Pose Fusion Method for Indoor Map Construction. ISPRS Int. J. Geo-Inf. 2021, 10, 800. [Google Scholar] [CrossRef]
  7. Zhai, G.; Zhang, W.; Hu, W.; Ji, Z. Coal mine rescue robots based on binocular vision: A review of the state of the art. IEEE Access 2020, 8, 130561–130575. [Google Scholar] [CrossRef]
  8. Wang, H.; Li, Z.; Wang, H.; Cao, W.; Zhang, F.; Wang, Y. A Roadheader Positioning Method Based on Multi-Sensor Fusion. Electronics 2023, 12, 4556. [Google Scholar] [CrossRef]
  9. Cheng, J.; Li, H.; Ma, K.; Liu, B.; Sun, D.; Ma, Y.; Yin, G.; Wang, G.; Li, H. Architecture and Key Technologies of Coalmine Underground Vision Computing. Coal Sci. Technol. 2023, 51, 202–218. [Google Scholar]
  10. Dai, X.; Mao, Y.; Huang, T.; Li, B.; Huang, D. Navigation of simultaneous localization and mapping by fusing RGB-D camera and IMU on UAV. In Proceedings of the 2019 CAA Symposium on Fault Detection, Supervision and Safety for Technical Processes (SAFEPROCESS), Xiamen, China, 5–7 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 6–11. [Google Scholar]
  11. Sun, K.; Mohta, K.; Pfrommer, B.; Watterson, M.; Liu, S.; Mulgaonkar, Y.; Taylor, C.J.; Kumar, V. Robust stereo visual inertial odometry for fast autonomous flight. IEEE Robot. Autom. Lett. 2018, 3, 965–972. [Google Scholar] [CrossRef]
  12. Syed, Z.F.; Aggarwal, P.; Goodall, C.; Niu, X.; El-Sheimy, N. A new multi-position calibration method for MEMS inertial navigation systems. Meas. Sci. Technol. 2007, 18, 1897. [Google Scholar] [CrossRef]
  13. Liu, J.; Sun, L.; Pu, J.; Yan, Y. Hybrid cooperative localization based on robot-sensor networks. Signal Process. 2021, 188, 108242. [Google Scholar] [CrossRef]
  14. Cen, R.; Jiang, T.; Tan, Y.; Su, X.; Xue, F. A low-cost visual inertial odometry for mobile vehicle based on double stage Kalman filter. Signal Process. 2022, 197, 108537. [Google Scholar] [CrossRef]
  15. Wang, H.; Zhang, Y.; Shen, H.; Zhang, J. Review of image enhancement algorithms. Chin. Opt. 2017, 10, 438–448. [Google Scholar] [CrossRef]
  16. Huang, S.C.; Cheng, F.C.; Chiu, Y.S. Efficient contrast enhancement using adaptive gamma correction with weighting distribution. IEEE Trans. Image Process. 2012, 22, 1032–1041. [Google Scholar] [CrossRef]
  17. Wang, F.; Zhang, B.; Zhang, C.; Yan, W.; Zhao, Z.; Wang, M. Low-light image joint enhancement optimization algorithm based on frame accumulation and multi-scale Retinex. Ad Hoc Netw. 2021, 113, 102398. [Google Scholar] [CrossRef]
  18. Dong, S.; Ma, J.; Su, Z.; Li, C. Robust circular marker localization under non-uniform illuminations based on homomorphic filtering. Measurement 2021, 170, 108700. [Google Scholar] [CrossRef]
  19. Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vision Graph. Image Process. 1987, 39, 355–368. [Google Scholar] [CrossRef]
  20. Baek, J.; Kim, Y.; Chung, B.; Yim, C. Linear Spectral Clustering with Contrast-limited Adaptive Histogram Equalization for Superpixel Segmentation. IEIE Trans. Smart Process. Comput. 2019, 8, 255–264. [Google Scholar] [CrossRef]
  21. Çiğ, H.; Güllüoğlu, M.T.; Er, M.B.; Kuran, U.; Kuran, E.C. Enhanced Disease Detection Using Contrast Limited Adaptive Histogram Equalization and Multi-Objective Cuckoo Search in Deep Learning. Trait. Signal 2023, 40, 915. [Google Scholar] [CrossRef]
  22. Aboshosha, S.; Zahran, O.; Dessouky, M.I.; Abd El-Samie, F.E. Resolution and quality enhancement of images using interpolation and contrast limited adaptive histogram equalization. Multimed. Tools Appl. 2019, 78, 18751–18786. [Google Scholar] [CrossRef]
  23. Yoon, J.; Choi, J.; Choe, Y. Efficient image enhancement using sparse source separation in the Retinex theory. Opt. Eng. 2017, 56, 113103. [Google Scholar] [CrossRef]
  24. Cheng, J.; Yan, P.; Yu, H.; Shi, M.; Xiao, H. Image stitching method for the complicated scene of coalmine tunnel based on mismatched elimination with directed line segments. Coal Sci. Technol. 2022, 50, 179–191. [Google Scholar]
  25. Gong, Y.; Xie, X. Research on coal mine underground image recognition technology based on homomorphic filtering method. Coal Sci. Technol. 2023, 51, 241–250. [Google Scholar]
  26. Tu, Z.; Chen, C.; Pan, X.; Liu, R.; Cui, J.; Mao, J. Ema-vio: Deep visual–inertial odometry with external memory attention. IEEE Sens. J. 2022, 22, 20877–20885. [Google Scholar] [CrossRef]
  27. Zhou, X.; Wen, X.; Wang, Z.; Gao, Y.; Li, H.; Wang, Q.; Yang, T.; Lu, H.; Cao, Y.; Xu, C.; et al. Swarm of micro flying robots in the wild. Sci. Robot. 2022, 7, eabm5954. [Google Scholar] [CrossRef]
  28. Forster, C.; Carlone, L.; Dellaert, F.; Scaramuzza, D. On-manifold preintegration for real-time visual–inertial odometry. IEEE Trans. Robot. 2016, 33, 1–21. [Google Scholar] [CrossRef]
  29. Qin, T.; Li, P.; Shen, S. Vins-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Trans. Robot. 2018, 34, 1004–1020. [Google Scholar] [CrossRef]
  30. Mourikis, A.I.; Roumeliotis, S.I. A multi-state constraint Kalman filter for vision-aided inertial navigation. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, Rome, Italy, 10–14 April 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 3565–3572. [Google Scholar]
  31. Dissanayake, G.; Sukkarieh, S.; Nebot, E.; Durrant-Whyte, H. The aiding of a low-cost strapdown inertial measurement unit using vehicle model constraints for land vehicle applications. IEEE Trans. Robot. Autom. 2001, 17, 731–747. [Google Scholar] [CrossRef]
  32. Tong, X.; Su, Y.; Li, Z.; Si, C.; Han, G.; Ning, J.; Yang, F. A double-step unscented Kalman filter and HMM-based zero-velocity update for pedestrian dead reckoning using MEMS sensors. IEEE Trans. Ind. Electron. 2019, 67, 581–591. [Google Scholar] [CrossRef]
  33. Chen, S.; Li, X.; Huang, G.; Zhang, Q.; Wang, S. NHC-LIO: A Novel Vehicle Lidar-inertial Odometry (LIO) with Reliable Non-holonomic Constraint (NHC) Factor. IEEE Sens. J. 2023, 23, 26513–26523. [Google Scholar] [CrossRef]
  34. Sun, R.; Yang, Y.; Chiang, K.W.; Duong, T.T.; Lin, K.Y.; Tsai, G.J. Robust IMU/GPS/VO integration for vehicle navigation in GNSS degraded urban areas. IEEE Sens. J. 2020, 20, 10110–10122. [Google Scholar] [CrossRef]
  35. He, X. Research About Image Tampering Detection Based On Processing Traces–Blur Traces Detection. Master’s Thesis, Beijing Jiaotong University, Beijing, China, 2012. [Google Scholar]
  36. He, Z.; Mo, H.; Xiao, Y.; Cui, G.; Wang, P.; Jia, L. Multi-scale fusion for image enhancement in shield tunneling: A combined MSRCR and CLAHE approach. Meas. Sci. Technol. 2024, 35, 056112. [Google Scholar] [CrossRef]
  37. Trawny, N.; Roumeliotis, S.I. Indirect Kalman Filter for 3D Attitude Estimation; Technical Report; University of Minnesota, Department of Computer Science & Engineering: Minneapolis, MN, USA, 2005; Volume 2. [Google Scholar]
  38. Burri, M.; Nikolic, J.; Gohl, P.; Schneider, T.; Rehder, J.; Omari, S.; Achtelik, M.W.; Siegwart, R. The EuRoC micro aerial vehicle datasets. Int. J. Robot. Res. 2016, 35, 1157–1163. [Google Scholar] [CrossRef]
  39. Xue, C. Research on Image Quality Evaluation Methods Based on Visual Perception and Feature Fusion. Master’s Thesis, Xi’an University of Technology, Xi’an, China, 2024. [Google Scholar]
  40. Lin, H.; Hosu, V.; Saupe, D. KADID-10k: A large-scale artificially distorted IQA database. In Proceedings of the 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), Berlin, Germany, 5–7 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–3. [Google Scholar]
Figure 1. Algorithm procedure.
Figure 2. Selection of image enhancement algorithm parameters. (a) Fixed low-frequency gain at 0.5, sharpening coefficient at 1, and contrast threshold set to 4. (b) Fixed high-frequency gain at 1.6, sharpening coefficient of 1, and contrast threshold of 4. (c) High-frequency gain is set to 1.6, low-frequency gain to 0.3, and the contrast threshold to 4. (d) The fixed high-frequency gain is 1.6, the low-frequency gain is 0.3, and the sharpening coefficient is 1.5.
Figure 3. Comparison of feature point extraction effect. (a) The feature point extraction result of the original image. (b) The feature point extraction result after CLAHE processing. (c) The feature point extraction result after homomorphic filtering processing. (d) The feature point extraction result after both CLAHE processing and homomorphic filtering processing.
Figure 4. Estimation of gyroscope bias coefficients on the MH02 and V203 sequences. (a) Variation in gyroscope bias for L-MSCKF and MSCKF-VIO on the MH02 sequence. (b) Estimated gyroscope bias values by L-MSCKF and MSCKF-VIO on the V203 sequence.
Figure 5. The trajectory of the algorithm on sequences V103 and V203 of the EuRoC dataset. (a) The trajectory on the V103 sequence. (b) The X, Y, and Z triaxial values on the V103 sequence. (c) The trajectory on the V203 sequence. (d) The X, Y, and Z triaxial values on the V203 sequence.
Figure 6. Comparison of absolute trajectory errors of each algorithm on weak light sequence V203.
Figure 7. Comparison of the computational efficiency of each algorithm. (a) Average CPU usage in % of the total available CPU, by the algorithms running the same experiment. (b) Total running time of each algorithm on the same dataset.
Table 1. The number of feature points extracted by the FAST algorithm from each scene in Figure 3.
Scene | Original | CLAHE | Homomorphic Filtering | Combine
Scene 1 | 1761 | 5829 | 1625 | 8570
Scene 2 | 11 | 200 | 5 | 248
Scene 3 | 436 | 2092 | 453 | 8728
Scene 4 | 36 | 141 | 11 | 199
Table 2. Performance comparison of image enhancement algorithm.
Method | I09: Original | I09: CLAHE | I09: Homomorphic Filtering | I09: Combine | I61: Original | I61: CLAHE | I61: Homomorphic Filtering | I61: Combine
SSIM | 0.440 | 0.517 | 0.350 | 0.528 | 0.316 | 0.405 | 0.237 | 0.438
PSNR/dB | 10.615 | 13.142 | 10.769 | 13.888 | 11.563 | 14.067 | 11.857 | 14.116
MSE | 5643.83 | 3153.81 | 5447.59 | 2656.52 | 4536.71 | 2549.33 | 4240.16 | 2520.64
Table 3. Translation error of the algorithm on the EuRoC dataset.
Sequence | VINS-mono/m (RMSE / Mean / Median / STD) | MSCKF-VIO/m (RMSE / Mean / Median / STD) | L-MSCKF/m (RMSE / Mean / Median / STD)
V101 | 0.092 / 0.080 / 0.067 / 0.045 | 0.100 / 0.090 / 0.076 / 0.045 | 0.081 / 0.074 / 0.073 / 0.034
V102 | 0.188 / 0.145 / 0.121 / 0.120 | 0.128 / 0.114 / 0.106 / 0.058 | 0.111 / 0.098 / 0.088 / 0.053
V103 | 0.195 / 0.170 / 0.156 / 0.095 | 0.207 / 0.195 / 0.192 / 0.070 | 0.166 / 0.143 / 0.127 / 0.084
V201 | 0.096 / 0.084 / 0.078 / 0.047 | 0.072 / 0.063 / 0.052 / 0.035 | 0.061 / 0.054 / 0.050 / 0.029
V202 | 0.141 / 0.119 / 0.087 / 0.075 | 0.152 / 0.141 / 0.134 / 0.058 | 0.149 / 0.140 / 0.132 / 0.049
V203 | 0.373 / 0.335 / 0.294 / 0.163 | 1.778 / 1.683 / 1.492 / 0.573 | 0.249 / 0.228 / 0.205 / 0.100
MH02 | 0.193 / 0.174 / 0.164 / 0.085 | 0.184 / 0.160 / 0.134 / 0.092 | 0.148 / 0.123 / 0.102 / 0.084
MH03 | 0.218 / 0.190 / 0.166 / 0.107 | 0.217 / 0.200 / 0.173 / 0.084 | 0.163 / 0.151 / 0.150 / 0.063
MH04 | 0.373 / 0.355 / 0.400 / 0.116 | 0.299 / 0.272 / 0.255 / 0.123 | 0.238 / 0.221 / 0.212 / 0.088
Table 4. Rotation error of the algorithm on the EuRoC dataset.
Sequence | VINS-mono/rad (RMSE / Mean / Median / STD) | MSCKF-VIO/rad (RMSE / Mean / Median / STD) | L-MSCKF/rad (RMSE / Mean / Median / STD)
V101 | 0.109 / 0.108 / 0.104 / 0.011 | 0.097 / 0.095 / 0.096 / 0.016 | 0.095 / 0.094 / 0.095 / 0.014
V102 | 0.082 / 0.078 / 0.075 / 0.026 | 0.044 / 0.042 / 0.041 / 0.014 | 0.038 / 0.036 / 0.040 / 0.014
V103 | 0.078 / 0.074 / 0.078 / 0.025 | 0.088 / 0.082 / 0.074 / 0.033 | 0.083 / 0.075 / 0.088 / 0.035
V201 | 0.040 / 0.033 / 0.026 / 0.022 | 0.027 / 0.025 / 0.026 / 0.009 | 0.011 / 0.011 / 0.010 / 0.004
V202 | 0.042 / 0.040 / 0.035 / 0.014 | 0.037 / 0.034 / 0.034 / 0.014 | 0.027 / 0.026 / 0.025 / 0.007
V203 | 0.061 / 0.051 / 0.042 / 0.033 | 0.157 / 0.155 / 0.145 / 0.027 | 0.035 / 0.033 / 0.031 / 0.012
MH02 | 0.038 / 0.038 / 0.037 / 0.006 | 0.048 / 0.047 / 0.046 / 0.011 | 0.048 / 0.045 / 0.044 / 0.017
MH03 | 0.030 / 0.027 / 0.026 / 0.011 | 0.026 / 0.024 / 0.022 / 0.010 | 0.025 / 0.021 / 0.015 / 0.013
MH04 | 0.027 / 0.020 / 0.017 / 0.018 | 0.034 / 0.031 / 0.024 / 0.014 | 0.024 / 0.021 / 0.016 / 0.012
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, M.; Lian, Z.; Núñez-Andrés, M.A.; Wang, P.; Tian, Y.; Yue, Z.; Gu, L. Robot Localization Method Based on Multi-Sensor Fusion in Low-Light Environment. Electronics 2024, 13, 4346. https://doi.org/10.3390/electronics13224346

