1. Introduction
Sensing the surrounding environment is essential for a robot to perform its mission, from planning to control. It is common to see autonomous vehicles equipped with various, sometimes even redundant, sensors to perceive the environment, often relying on sensor fusion techniques [
1]. Similarly, industrial robots require multiple sensors to work on strongly circumstance-dependent tasks, such as object grasping, obstacle avoidance, and human–robot collaboration [
2]. These efforts show how crucial and difficult it is for a robot to understand the world, especially in unstructured environments, which have not been modified specifically to facilitate the execution of a task [
3].
Cameras are devices containing a sensor that captures visual data in the form of images or video, which can be analyzed and processed to extract useful information about the environment. In the market, cameras are available at different costs and designs presenting sensors, such as RGB, RGB-D, and thermal sensors [
4,
5]. Some of the advantages of using cameras as sensor devices on feedback systems are the flexibility of the data it represents, the possibility to extract real-time information, and it does not require contact to operate.
Visual servoing is a control strategy that uses visual feedback from cameras to control the motion of a robot or a system. It has been extensively studied and developed over the past few decades and applied in various applications in robotics and automation, such as industrial robots [
6], underwater vehicle manipulation [
7,
8], multiagent tasks [
9], and medical imaging [
10]. Visual servoing aims to use the visual information captured by the cameras to adjust a robot’s motion to achieve a desired task, such as positioning an object, following a path, or tracking a moving target. That visual information comes in the form of image features, which can be colored points, circle radius, bounding boxes, contours, or any image element of interest.
The visual servoing formulation relies on the mapping of image features and the camera pose, best described in terms of derivatives and an image jacobian matrix, also known as the interaction matrix, to relate them. However, this matrix requires knowledge of the intrinsic and extrinsic parameters of the camera, which are not easily obtained. Even when a camera calibration is made a priori, there are limitations to that approach, especially in a single-camera scenario and unstructured environments. Small physical disturbances to the device, non-linear lens distortion, and a tracking object moving out of the field of view are among the examples of situations requiring camera recalibration [
11,
12,
13].
Uncalibrated visual servoing (UVS) aims to estimate the image jacobian matrix from offline preset moves or through online interactions. It may start with an initial guess from a previous calibration process or skip that phase entirely with random initial values. This approach permits a vision robotic system adaptable to the camera and environment changes. Despite the advances in computer vision, a recent review of vision-based robotic applications stated that images have faults due to problems like picture noise [
14]. If that noise is not properly treated, it can lead to difficulties in the image Jacobian estimation convergence.
The examples shown previously, which would require camera recalibration, can be treated as noise in the image features tracking task. Moreover, other sources of noise can be changing light conditions and motion blur. As UVS adaptively estimates the interaction of features and camera pose, most algorithms can handle the uncertainty if the noise is small. There are techniques concerning UVS estimation in the presence of Gaussian noise that may work well under smooth and well-behaved disturbances, like camera vibration and motion blur. Nonetheless, impulsive and non-Gaussian noise occurs when, for example, a tracking object suddenly moves out of the field of view, or the light in the image is interfered momentarily by other light sources, producing reflections.
Generally, algorithms expecting non-Gaussian noise work well with Gaussian noise, but not the other way around [
15]. Some of the techniques to estimate non-Gaussian noise are particle filter, M-estimator, and Maximum Correntropy Criterion (MCC). A convenience of MCC is that it takes advantage of the probabilistic properties of the non-Gaussian random variables, as it uses the correntropy similarity metric, associating that to an optimization problem. Approaches presented in [
16,
17,
18] aim to implement the MCC in the Kalman Filter.
Using the traditional MCC with the Kalman Filter formulation, in ref. [
19], online interaction matrix estimation was implemented, relating feature motion to a robot-attached camera motion. While two non-Gaussian noise scenarios were studied, it is possible to develop a more extensive study in the characterization of that noise in the UVS application. Furthermore, while that study determines camera-to-joint motion using robot geometry, a general estimation to map feature motion directly to joint motion is possible. Although most IBVS tasks present more than one image feature, the reviewed techniques skip the correction step when a single feature presents an outlier measure, keeping the jacobian estimation from the previous iteration, which may not correspond to the system’s current state.
This work proposes a different approach for Maximum Correntropy Criterion Kalman Filtering, designed specifically for applying single-camera uncalibrated image-based visual servoing, where the IBVS control law is affected by a correntropy-induced error vector. It uses a regularization term to ensure that matrices used in the correction step remain non-singular, thus maintaining the estimation update even when some features are detected as outliers by using the values from other features. Also, we propose a simulated annealing to adjust the MCC kernel bandwidth, which regulates the tolerance of the estimation update strategy to large estimation errors that could be caused by measure outliers. Our approach estimates the whole feature to joint space jacobian matrix, handling robot geometry modeling uncertainties through online data. The technique was extensively tested using Monte Carlo experiments, varying robot starting configurations, and applying additive impulsive noise distributions.
This work is organized as follows:
Section 2 presents fundamentals on Uncalibrated Visual Servoing (UVS) and introduces Kalman Filter, how it can be seen as an optimal estimator for state observing, and how it can be applied to UVS.
Section 3 explores the problem and efforts of dealing with non-Gaussian noise on UVS, and how the Maximum Correntropy Criterion is used to address that in the form of two main techniques, MCKF and IMCC-KF.
Section 4 brings the main contribution of this work, describing the problems of the previously presented techniques in the studied context and proposing algorithms designed specifically for it.
Section 5 shows the methodology of the experiments designed to test the proposed technique’s performance, including the system architecture, simulation scenarios, and Monte Carlo experiments.
Section 6 discusses the results in three comparison scenarios.
Section 7 concludes this work, presenting limitations and future work.
Notation 1. Throughout this paper, we introduce some notation rules for clarity. Bold lower case letters denote vectors, bold upper case letters are matrices. Symbol ∧ above a matrix, vector, or scalar represents an estimate of the counterpart. In addition, we define notation to denote a zero matrix.
2. Uncalibrated Visual Servoing
In image-based visual servoing (IBVS) control architecture, where robot motion action is computed directly by image information, image jacobian
describes the relationship between feature displacement
and camera velocity
, as described in
Figure 1. Thus, control action
is designed to compute the required camera velocity based on the error between the desired feature values
and the measured
, and a gain
, as shown in Equations (
1) and (
2).
where
is the pseudoinverse of
. As described in the basics of image formation [
12], the image jacobian varies given the position of the tracking objects in the real world and the camera. When the camera is attached to the robot end-effector, in an eye-in-hand configuration, the motion of robot joints is important for the IBVS. Thus, a general IBVS control law
should consider the robot kinematics jacobian
that relates the end-effector motion to joint velocities and the image jacobian matrix, as shown in Equations (
3) and (
4).
Estimating the image jacobian is the main concern in many uncalibrated visual servoing problems, but there are advantages in using the estimation for the full jacobian. The first advantage is that, as estimation relies on data through iteration, it can handle uncertainties in robot modeling. Secondly, the IBVS algorithm can skip the computation of teh
matrix and the matrix multiplication of Equation (
3). On the other hand, estimating the full jacobian can be harder than estimating the image jacobian, so it is preferable to use an analytical initial guess at the start of the task, through camera calibration and geometric robot modeling, such as Denavit–Hartenberg parameters.
Among the practical considerations related to uncalibrated visual servoing tasks [
20], the error produced by the visual feature tracker should be handled with care. To maintain high precision, visual tracking algorithms are often computationally expensive, which may provide delays in feedback for the entire control system, affecting the whole system’s performance. In contrast, the traditional image-based visual servoing approach is known to be quite robust to image jacobian uncertainty [
12], but dealing with feedback noise effectively requires special considerations in the algorithm formulation. Because of that, Kalman filtering estimation is a recurring approach in UVS studies.
Kalman Filter for UVS
The Kalman filter is a mathematical algorithm introduced as an attempt at a recursive solution to the discrete-data linear filtering problem [
21], and it has become one of the most popular and powerful algorithms in the field of control theory. It takes measurements of a system’s state and combines them with predictions of the system’s state to produce an estimate that is as accurate as possible. Two main stages compose the algorithm: the prediction stage, which estimates using a mathematical model, and the correction stage, which improves the estimation using new measurements.
At its core, the Kalman filter is an optimal estimator that can be seen as a state observer that handles uncertainty [
22]. We consider a causal linear discrete system in the state-space formulation, given the
kth iteration, with state vector
, input vector
, state transition model
, input model
, and observation model
:
That dynamical system presents no uncertainty in the process or measure equations. Thus, a simple observer can be designed to track the system’s evolution, using prediction given the actual state and input, and a set of states consistent with measurement . This implies an estimated state that corrects to the set using . The choice of is related to an optimal gain times an innovation error defined as the difference between the measured output and the predicted output . In that case, that gain is calculated using the fact that for to be the shortest path to , it should be orthogonal.
This fact does not sustain, as the modeled process equation for
and measures
can be uncertain. That is the case in which the Kalman filter is generally presented to the full extent. Considering that uncertainty values present zero-mean Gaussian noise, the uncertainty variable for the process
, and in the measurement
, present covariances
and
, respectively. Thus, the Kalman filter expects a system described by the following expressions:
Also, vectors
and
should be independent random variables. Moreover, using
as the expected value operator and
as the covariance function, the probabilistic conditions of the Kalman filter are summarized below:
The predicted state
is computed using the model dynamics as if there is no noise in the system. The filter predicts the error covariance
using the definition of the covariance matrix expectation. Equations (
14) and (
15) present the prediction steps in the Kalman filter. The filter receives measure
, modeled with white noise
that has covariance
. Then, it computes an optimal gain
using Equation (
16). That gain is important to update the estimative on
and
, as shown in Equations (
17) and (
18), respectively. Particularly in the state update equation, gain
weights the components of the innovation error, directing to where the state should correct its prediction.
The Kalman filter provides an approach that can efficiently handle Gaussian noise in the online estimation of the image jacobian. The works using that approach treat the estimation process as an evolving linear system with associated probabilistic uncertainty. A state vector is presented by concatenating the interaction matrix’s estimated row elements, and the jacobian is estimated through measures of how the features change over time compared to the end-effector motion, so that is a flattened version of or . Thus, the uncertainty is present in how the jacobian estimation evolves and in the measurement of the features, being treated as random variables with Gaussian distribution.
The main assumptions of the traditional Kalman filter are that the system has linear dynamics and the uncertainties in its process and measure follow Gaussian distributions. Although the problem of uncalibrated visual servoing is known to be nonlinear, most studies make approximations to assure linearity, which works best with slow dynamics. Robot motion may not be slow for human perception, but mechanical movement can be considered slow for the algorithm loop if the computation power is high enough. Furthermore, as initially proposed, the Kalman filter is robust to modeling errors at a certain level because it handles process uncertainty. On the other hand, non-Gaussian noise in the measuring step is very likely to appear, especially in unstructured and uncontrolled environments, where approximations may interfere with task performance.
3. Dealing with Non-Gaussian Noise
A simple mission for humans, such as picking fruit from an orchard, could present a challenging problem to computer vision algorithms. During the manipulation task, leaves can suddenly occlude the fruit, removing it from the image frame. If the task is being performed in the field, a cloud in the sky could pass before the sun, degenerating the scene’s illumination. Different fruits can present different textures that may present reflection to light, which can affect processing image steps. Finally, the sudden presence of some other object with similar image properties can confuse the feature tracking. All these are examples of situations where feature tracking can present non-Gaussian behavior, with outliers from a normal distribution.
While outliers can be disregarded using high-pass filters or more complex computer vision treatments, there are situations where the presence of these outliers is significant. For example, these outliers can sometimes be well described by distribution models, like a Gaussian mixture, preserving the statistical properties of the variable. However, the Kalman filter is not prepared to properly handle variables presenting that characteristic. Thus, the Kalman filter would consider the variable to be a Gaussian distribution, where the effect of the outlier is to offer a wrong average as the first-order expected value, and the correction step, which relies on second-order covariance matrices to compute the optimal gain, would not guarantee its optimality.
In paper [
23], UVS with particle filter jacobian estimation was proposed where fuzzy membership functions adaptively change the number of particles. However, that work simulated a holonomic mobile planar robot, which presents no constraints in movement. In study [
24], an iteratively reweighted least-square algorithm to UVS was proposed, implementing an M-estimator that optimizes a residual function. On the other hand, this method requires a memory of pairs involving the feature vector and the respective motor positions vector to aid the estimation, using the K-nearest neighbor pairs to estimate the jacobian for the current state the system is at.
Lately, in [
19], the Maximum Correntropy Criterion was applied to the correction step in the Kalman filter correction; it uses correntropy, a similarity metric for random variables based on their whole probabilistic distribution. There is no need for memory since the algorithm is recursive. As that metric is insensitive to large errors, it is suitable for estimation in the presence of outliers, for example. However, the estimation algorithm used in that work, known as MCKF, as it was originally proposed, avoids the estimation of the whole jacobian if one of the tracking features presents an outlier, leading to potential problems in the IBVS control law. Also, an extension of that study could be applying the estimation for the whole jacobian matrix shown in Equation (
3), instead of only the image jacobian. This work addresses these disadvantages with a new formulation of IBVS and jacobian estimation using the Maximum Correntropy Criterion and the Kalman Filter.
3.1. Maximum Correntropy Criterion
The Maximum Correntropy Criterion (MCC) can take advantage of the statistical properties of non-Gaussian random variables. Unlike traditional estimation criteria, MCC aims to maximize the correntropy between the observed data and the model prediction. The correntropy function measures the similarity between two signals by considering the nonlinear dependencies and higher-order statistical moments. Given two random variables
and
, with joint distribution
, correntropy is defined by Equation (
19).
Kernel function
determines the shape of the correntropy measure. While several kernel options exist, which are shift-invariant symmetric positive-definite functions that follow Mercer’s theorem, the Gaussian kernel is a popular choice in correntropy-based problems because of its favorable mathematical properties. Considering
e as the difference between the random variables, and a kernel bandwidth
, the Gaussian kernel is given by Equation (
20).
The Gaussian kernel has a bell-shaped curve that assigns higher weights to points closer to the curve’s center and lower weights to points farther away. This makes it suitable for measuring the similarity between two signals, as it offers more weight to similar points.
A model estimation problem can implement the MCC to maximize the correntropy between the observed data and the model prediction. Given a sequence of
error data and the problem of learning a parameter vector
of a model from a set of feasible parameters
, an MCC-based estimation is the solution to the optimization problem shown in Equation (
21).
Therefore, MCC effectively increases the value of the error probability density function at zero, which is the natural thing to do in regression or adaptive filtering, where the goal is to increase the number of small deviations between X and Y [
25]. Parameter estimation implementations using the Maximum Correntropy Criterion can solve the optimization problem using iterative methods, such as fixed-point iteration.
3.2. Maximum Correntropy Kalman Filter
A derivation of the Kalman Filter to use MCC instead of the Minimum Mean Square Error (MMSE) criterion was proposed to perform better in non-Gaussian noise scenarios; that technique is called a Maximum Correntropy Kalman Filter (MCKF) [
16]. It uses the same prediction steps as the traditional Kalman Filter, using model propagation equations to estimate the state and the covariance matrix. On the other hand, the MCKF update step uses the correntropy of the innovation error in a Gaussian kernel as an optimization criterion to weigh the correction in the posterior estimation. This approach considers a Kalman linear model, as shown in Equation (
22).
This equation is obtained from the system model in Equations (
7) and (
8) through transformations and matrix decompositions using the following expressions:
where the state vector
has
n values, the measurement vector
has m values,
is a
identity matrix and
is obtained using Cholesky decomposition.
From this formulation, an MCC optimization problem can be defined using Equation (
30), where a Gaussian kernel with bandwidth
evaluates the similarity between the variables composing the innovation error from Equation (
22). That cost function gives the optimal estimate to
, serving as an update step in this derivation of the Kalman filter. That equation uses
as the ith row of
,
as the ith row of
, and
as the ith row of
.
An algorithm using fixed-point iteration can solve Equation (
30). We let
and
, where
describes the estimated state at fixed-point iteration
t, computed as follows:
The fixed-point iteration algorithm requires a stop condition, so that if a new iteration does not bring relative contribution to the estimation, the algorithm stops. That condition is shown in Equation (
38), where the small positive
holds the threshold to either stop the algorithm or demand a new iteration, with
. If a new iteration is required, the algorithm calculates Equations (
31)–(
37), starting with the state vector estimation from the previous iteration.
The covariance matrix update is computed afterward, using the optimal gain . While calculating differs from the traditional Kalman filter, the covariance matrix correction uses the same Equation (18). If an error component presents a very large value, it indicates an outlier, so or can become singular. When that happens, the algorithm stops, and the correction step should be skipped, retaining the last estimation, so .
3.3. Improved Maximum Correntropy Criterion Kalman Filter
Based on the one iteration fixed point approach suggested on [
17], which implies simplifications in the correction equations, in ref. [
18], an algorithm was proposed called the Improved Maximum Correntropy Kalman Filter (IMCC-KF). The goal was to improve estimation accuracy while retaining the computational complexity. It applied a correntropy-induced algorithm where the
norm of the estimation innovation error is evaluated through a kernel function, resulting in a scalar
.
When there is no big error between the estimated and the measured value, i.e., innovation error, tends to 1 and the algorithm is reduced to a Kalman filter. Otherwise, if is sufficiently big, it could imply that the measured value is an outlier, leading to 0, as well as the gain , so there is no update. That approach follows the same analysis that if the kernel bandwidth , it reduces to a traditional Kalman Filter, for all measured values.
4. Regularized Maximum Correntropy Criterion Kalman Filter for IBVS
Using the base formulation of MCKF and the single-iteration considerations of IMCC-KF, we developed a correntropy-induced technique for Kalman Filtering, specifically for the IBVS jacobian estimation and the control law. Therefore, the algorithm actuates in two phases of the visual servoing system. To better illustrate the case in which the proposed approach was designed, we consider an IBVS problem of tracking four circles, each circle presenting two features for the XY coordinates of its center. Due to light problems, occlusion, or detection failure, the vision system states that the circle is temporarily in a very distant spot. In that situation, the covered techniques should detect two features with large estimation errors, such as a potential outlier, and skip the correction step, preserving the last estimated value for the jacobian. This impacts the control law since it tries to minimize the error in the feature space using wrong values for the system’s current state. That way, the correctly measured features from the remaining circles are discarded.
Using the previously presented MCKF, the matrix
degenerates since it is a diagonal matrix and some of its diagonal values are close to zero. This makes
singular, resulting in the algorithm skipping the estimation update for all features in the iteration. To avoid terminology confusion, we define
which is positive semi-definite, and could be singular. However, we can apply regularization term
to enforce a condition to ensure matrix inversion. Since
, we have
that is always positive definite and, consequently, invertible. Therefore, if one of the eigenvalues of
is zero, its respective inverting Equation (
45) will be
. In the context of the present work, since another inversion will happen to compute the optimal gain
, it means that only parts of the jacobian estimation respective to that problematic feature will not be updated.
Algorithm 1 estimates the jacobian based on how feature measures change, compared to how the robot joints update. Symbol ⨂ is the kronecker matrix product operator. One of the main advantages of this approach, in the context of IBVS, is that while the other approaches skip the whole correction step if any feature presents a potential outlier measurement, the proposed algorithm tries to update the jacobian values based on the trusted measures only, depending on the kernel bandwidth , skipping only parts of the jacobian related to the problematic measures.
The regularization in Line 9 allows the algorithm continuation of computing a value for and, consequently, . Considering small, the effect of the outlier measure in the estimation process is reduced to , which does not harm the remaining measures. If the tracking of the ith feature is consistent with its estimation, and , then the correction step works almost like the Kalman Filter.
Algorithm 2 uses the resulting estimation from Algorithm 1 in an outer loop, calculating the robot joint velocities that lead the image features to their desired position. If only the RMCKF is used with the traditional IBVS control law, from Equation (4), a problematic feature tracking measure suppressed in the RMCKF still impacts the feature error computation and produces undesired robot motion. Thus, it is desirable that the jacobian or the error components relative to that outlier should have a close to zero effect in the control law, but not stop the whole robot motion for the remaining features. The proposed algorithm solves that by incorporating the correntropy-induced vector
into the error component, using ⨀ as the element-wise multiplication operator. If the jacobian components were altered instead, the last trusting estimation would be lost, potentially affecting the estimation algorithm convergence.
Algorithm 1 Regularized Maximum Correntropy Criterion Kalman Filter |
Ensure: | ▹ Kernel follows Mercer Theorem |
Require: | ▹ Reading of current features values |
Require: | ▹ Reading of current motor joint positions |
1: procedure RMCKF () |
2: |
3: |
4: | ▹ Measurement may be contaminated with noise |
5: |
6: | ▹ Cholesky matrix decomposition |
7: |
8: |
9: |
10: |
11: |
12: |
13: return | ▹ Returns the estimation, other variables persist for the next loop |
14: end procedure |
Algorithm 2 IBVS with RMCKF |
Require: | ▹ IBVS gain |
Require: | ▹ IBVS end loop time |
Require: | ▹ Discrete timestep |
Require: | ▹ Kernel bandwidth |
1: |
2: |
3: while
do |
4: if annealing = TRUE then |
5: |
6: else |
7: |
8: end if |
9: |
10: | ▹ is a flat version of |
11: |
12: | ▹ Correntropy-induced IBVS law |
13: |
14: |
15: end while |
As implemented with the other approaches tested in this work, a simulated annealing implementation for is shown in Lines 4 to 8. In the context of UVS, that is a useful resource, since a fixed small kernel bandwidth can result in an estimation convergence that is too slow. That annealing, which is exclusive in this work, aims to make the RMCKF work closer to a traditional Kalman Filter at the beginning of the task when the algorithm should expect more learning on the difference between the measures and the correspondent estimated motion. As the task evolves, the algorithm can use a small , so it still learns but becomes more intolerant to impulsive changes in the estimation. The Results section provides a comparison of using the simulated annealing and establishing a fixed kernel bandwidth.
In summary, the three main concerns that the proposed algorithm aims to solve are the following: (1) The correction step for the jacobian estimation should consider the features with reliable measurements and ignore the untrusted ones, not skip entirely if one single measure is a potential outlier; (2) While the IBVS control action should disregard motion for the unreliable measures, the robot should not stop entirely, since it can try to minimize the error components of the remaining measures; (3) At the beginning of the task, the estimation algorithm can be more permissive to bigger changes, but as time goes on, it should become more resistant to those as the estimation would likely be closer to convergence and those impulsive changes have a great chance to be measurement outliers.
5. Materials and Methods
The visual servoing system which is being studied and proposed in this work is in the category of image-based visual servoing (IBVS), which uses the features of the image directly in the feedback. The desired image features values
are compared to the measured features
, as shown in
Figure 2, which depicts the system overview. That comparison generates the error signal
, applied in the visual servoing control law to compute the resulting robot joint motion
. The visual servoing control law requires the IBVS jacobian that is being estimated either from a fixed approximation using the application of image formation theory and robot kinematics modeling equations in Equation (
3) or through the use of recursive techniques, i.e., traditional KF, MCKF, IMCCKF or the proposed RMCKF. As this study aims primarily at eye-in-hand visual servoing configuration, the camera pose is the same as the robot’s end-effector pose.
The robot is assumed to have an inner control loop to receive joint position and velocity commands. Thus, at the start of each loop, the system reads the current robot joint position . Furthermore, a computer vision algorithm extracts and matches features to track how they change between image frames. To simulate the problems in real unstructured scenarios, the system adds non-Gaussian noise to feature tracking using an independent random number generator for each feature, following a proper impulsive probabilistic distribution known as the Lévy -stable distribution.
5.1. Simulation
The UVS scenarios were simulated using the CoppeliaSim robotics simulator, which implements the Bullet physics engine. That tool allows us modelling of a robot, control of its joints, placement of cameras attached to or outside the robot structure, and seting of different parameters on physics, image acquisition, and environment. CoppeliaSim provides an Application Programming Interface (API) in various programming languages to implement custom control algorithms. This study used the Python programming language since its numerical and scientific libraries offer fast prototyping for the discussed techniques.
A Universal Robots UR10 was placed in the simulation using one of the predefined models in CoppeliaSim. This robot was chosen by a correlated study, which described a UVS system using MCKF [
19]. Since this work estimates the conjoined jacobian that relates image features and joint movement directly, a kinematic description of the robot isprovided. The Denavit–Hartenberg parameters for the UR10 are presented in
Table 1, using the standard formulation and link sizes obtained from the official technical specifications. Since all robot joints are revolute, the robot configuration is defined by vector
; thus, the joints are configured in the Kinematic mode to receive joint position commands. The offset values of the
column were defined to match the joint commands to the simulated robot’s joint behavior. The regular API that CoppeliaSim provides functions to read the joint positions directly, simulating an absolute encoder response.
Figure 3 depicts the main simulation scenario, where the robot performs the UVS task. Four distinctly colored circles are positioned on the ground, with the robot camera aiming at them. The XY image coordinates of each circle are detected using image processing, resulting in a total of 8 features. Noise affecting one of the features does not interfere with the measure of others.
The camera object used in simulations is a visual sensor that provides perspective imaging with a resolution of 256 × 256 pixels. That camera object is based on the DVS128 event camera preset model from the CoppeliaSim simulator but with a customized resolution, providing an equilibrium of quality and low latency imaging. It returns an OpenCV-compatible object through the API at each timestep. This allows the implementation of image processing filters directly in the Python controller application. All scenarios were configured with a timestep of 0.05 s, executed in the stepping mode.
5.2. Simulated Noise
For each feature, an independent additive noise following a Lévy
-stable distribution simulates unstructured environment problems that may happen to feature tracking. That distribution is a subset of the stable distribution and shares characteristics with the Gaussian distribution, which becomes a special case of it [
26]. A convenient way to compute it was presented in [
27], which introduced its characteristic equation as Equation (
46).
Parameters and define, respectively, the displacement and the scale, which may be translated to mean and standard deviation in a Gaussian distribution case. The parameter defines the skewness or symmetry of the distribution, so if , the distribution graph is symmetrical. Parameter is the characteristic parameter, as it is related to the number of impulsive values in the distribution. If , the Gaussian normal distribution appears, and as it lowers, the probability of appearing a number very distant to greatly increases.
A normalized symmetric distribution is used for the experiments in this study, which means
,
, and
. When
, the additive noise is a white noise contaminating the feature tracking. Amn
generates an additive noise that sometimes takes the value of the feature far off its detected place, in an impulse, and goes back to a value close to the rightly detected.
Figure 4 shows the noise profile for an experiment with
. Considering that we have 8 image features in our IBVS problem,
are the random variables generated using alpha-stable distribution, producing additive noise to each feature. Their histograms differ because these variables are independent since different features should present different noise values at any simulation time. In the presented case, the impulsive noise is particularly intense in the fourth, fifth, and eighth features, causing a displacement of more than 80 pixels from the true value.
5.3. Monte Carlo Scenarios
Considering that is the characteristic parameter that regulates the behavior of the additive noise on feature tracking, it is the main changing parameter in the simulation scenarios. A Monte Carlo series of experiments is developed with 12 linearly spaced values varying between 1 and 2, as already presents a large number of outliers. For each , independent UVS experiments are carried out. A total of four estimation techniques are implemented and compared in those experiments: traditional Kalman Filter (KF), Maximum Correntropy Kalman Filter (MCKF), Improved Maximum Correntropy Criterion Kalman Filter (IMCC-KF), and the proposed RMCKF for IBVS. All correntropy-induced techniques are using the simulated annealing approach.
This work studies three scenarios of performance comparison, which represent common IBVS feature regulation tasks. The first scenario compares all four techniques in a features translation IBVS task. The manipulator starts in the joint configuration , placing the features in the top side of the camera image. Then, the visual servoing algorithm takes the task of centering the image features to the vector . Although the experiments for each have the same starting conditions, the noise distribution differs for each experiment, resulting in distinct results.
The second scenario differs from the first because each experiment starts with different joint configurations, maintaining all features in the camera view at the beginning of the simulation. Because of that, to reach the same image features desired vector, the IBVS is supposed to perform translation and rotation of the features. Lastly, the third scenario is the same setup as the second, but only the simulated annealing on the RMCKF is being evaluated compared to the fixed kernel bandwidth approach. The Kalman Filter is being used as a baseline of discussion.
The Integral of Time-weighted Absolute Error (ITAE) criterion is used as a performance metric. As 8 features are used, the scenario stores a vector norm of the ITAE results. After all experiments are run, the mean and standard deviation are persisted and shown in an error bar plot, where techniques are compared. Also, the initial value for the state vector is computed using an analytical jacobian approximation using the starting robot joint positions and an initial guess for the image iteration matrix based on a priori camera calibration parameters. Afterward, the estimation algorithm evolves the state vector and the jacobian values.
6. Results and Discussions
This section presents the results of the three Monte Carlo experiment scenarios, and some discussion to explain the performance of the techniques. The first scenario compares the estimation algorithms in a simple feature translation task. The second scenario is harder than the first, because the robot starts at different configurations, requiring translation and rotation of the features. Whereas in previous scenarios, all techniques use simulated annealing, the third scenario evaluates the relevance of the simulated annealing approach for the kernel bandwidth, compared to tests with a fixed kernel bandwidth.
6.1. Scenario 1
Figure 5 shows the results of the first scenario. When the noise distribution profile is Gaussian, i.e.,
, there is no relevant difference between the techniques, which was expected since the Kalman Filter is optimal for problems involving white noise signal. However, more impulsive noise appears for
closer to one, and the correntropy-induced techniques perform better in those situations. In the designed scenario, the advantages of RMCKF over the remaining techniques are shown by the lines representing the mean of ITAE and the standard deviation bar, representing UVS tasks being resolved with less accumulated error and more homogeneity.
6.2. Scenario 2
The results for the second scenario are shown in
Figure 6. The tasks required by that scenario are more difficult for all techniques, as the mean ITAE lines are generally higher than those in the previous scenario results, especially for the highly impulsive noise experiments. The zoom plot aids in showing that the RMCKF performs better than the other approaches. As the robot starting configuration changes over experiments, the standard deviation bar does not present the same effect from the previous scenario. Still, it shows that the algorithm performances are not too far from the mean, except for the
experiments, where some cases fail to center the image features due to a combination of joint configuration and noise leading to out-of-image features.
The better performance of RMCKF is because not all image feature measures present impulsive values simultaneously, so it tries to estimate the jacobian and perform the visual servoing task concisely with the remaining values. The other correntropy-induced techniques skip a new jacobian estimation but still use the old estimation for an IBVS control action, which may lead to incorrect joint motion since the value of both image and kinematics jacobian are highly dependent on the system’s current configuration. The Kalman filter algorithm tries to correct the estimation at all times, even when the image processing algorithm states that the feature is very far from its previous position, e.g., out of the image. However, when only Gaussian noise is expected, KF is the preferred option, because of its simpler implementation.
6.3. Scenario 3
Figure 7 depicts the results of the third scenario. The core concept to understand the importance of the simulated annealing approach here is understanding how the estimation error
evolves from the start to the end of the task. It is important to remember that
is the difference between the
R-normalized measured feature motion
and the estimated feature motion
based on joint movement. Thus, even with an initial jacobian guess,
may be big enough at the start of the IBVS task, requiring higher estimation correction. If the kernel bandwidth
is too small, the algorithm may consider that estimation error originated by a measure outlier, skipping the contribution of the components related to a specific feature.
The RMCKF with fixed
is very intolerant to big estimation changes, leading to poor performance in the IBVS task, worse than KF. On the other hand, the RMCKF with fixed
performs better than KF for very impulsive noise, with results close to the simulated annealing RMCKF. However, as the noise distribution starts to present a more Gaussian aspect, the advantages of the RMCKF with fixed
to the KF disappear, so at
close to 2, the performance of KF is better. More precisely, their performance lines cross at
around
, as shown by the zoom plot in
Figure 7.
The simulated annealing RMCKF has the best performance of the compared approaches. In the execution of tasks, it is tolerant initially, as the algorithm expects bigger changes in the estimation correction and retains the impulsive noise rejection for the rest of the assignment. If a measure outlier appears at the end moments of the simulation, the technique still rejects it, but the convergence from the estimation for the remaining features motion should be sufficiently reliable to still perform the task. When the noise profile is closer to a Gaussian distribution, it works similarly to the Kalman filter.
7. Conclusions
This work presented a new approach to handling non-Gaussian noise in uncalibrated image-based visual servoing tasks. That type of noise is common when feature tracking in unstructured environments, where light, reflexes, and occlusions are not in control. The proposed RMCKF was designed specifically for the IBVS problem, managing the ways on which both the estimation and the control action are calculated, so the robot may perform its tasks with as much information as it can rely upon, even when some of the measures are potentially impulsive outliers. Extensive experiments were driven using random configuration and noise parameters. The results showed that the proposed approach, using the simulated annealing, outperformed the compared techniques in the tested scenarios.
The studied and proposed approaches were correntropy-induced techniques, which took advantage of the characteristics of the kernel-based correntropy criterion in their equations to reach a desired behavior when outliers are present. However, an extended study on how the Maximum Correntropy Criterion can deeply incorporate estimation algorithms is required to explore other characteristics of the involved higher-order statistical moments that are evolving with the estimation.
Impulsive noise in the UVS scenario can possibly appear, when there is fast external interference in the observed objects. However, another case is when something occludes the object momentarily, retaining that noise value for a longer time, and presenting its own dynamic. The correntropy-induced techniques can be robust to impulsive noise, but in the latter case, they are not known to contribute to the task.
Future work will apply the proposed technique to different robot models and mechanical structures to explore the limits of the IBVS jacobian estimation. Also, the RMCKF can be adapted to directly estimate the pseudoinverse IBVS jacobian, skipping a matrix inversion in the control action, but the convergence, advantages, and side effects of that approach are yet to be studied.