1. Introduction
Low-thrust electric propulsion is a favored choice for numerous space missions due to its significant benefits in terms of propellant consumption [1]. Mission designers often seek to evaluate time-optimal and fuel-optimal transfer trajectories. The optimization of low-thrust trajectories is typically formulated as an optimal control problem (OCP), which can be solved numerically by using either direct or indirect methods.
Direct methods are known for their flexibility and robustness, although they require increased computational resources to achieve high accuracy. In contrast, indirect methods offer several advantages, including high numerical accuracy and the ability to ensure that any solution satisfies the necessary optimality conditions. These methods employ Pontryagin’s Maximum Principle (PMP) and result in a boundary value problem (BVP). However, the challenge with indirect methods lies in solving the BVP by using shooting methods, which heavily rely on accurate initial guesses for the costates. Since the costates may not possess physical meaning, providing precise initial guesses is rarely a simple task. This limitation reduces the robustness of indirect methods and makes their implementation time-consuming, especially when evaluating numerous transfers.
To address these limitations, recent advancements have introduced methods such as the enhanced smoothing technique described in [2] and the Uniform Trigonometrization Method detailed in [3,4]. These approaches offer significant improvements in the solution process for indirect methods, even when dealing with control and state constraints. Their contributions underscore that this is a very active area of research, with continuous efforts to improve the practicality and robustness of indirect methods.
Despite these advancements, challenges remain, particularly in multitarget missions, where the objective is to find the optimal sequence of transfers that minimizes an overall mission cost index. In these scenarios, the reliable evaluation of transfer legs is crucial, but the vast number of potential transfers makes classical optimization methods impractical for early mission planning stages. Instead, there is a need for the rapid estimation of transfer costs with sufficient accuracy, without delving deeply into the optimal control law of each individual trajectory.
Active debris removal (ADR) missions in low Earth orbit (LEO) and on-orbit servicing (OOS) missions of satellites represent classical examples of this kind of problem. A key challenge in these missions is the ability to rendezvous with multiple objects within a limited timeframe. In fact, the time that a servicer spacecraft spends in transfer orbits plays a pivotal role in determining the overall duration of a servicing mission. Therefore, the aim of this study is to explore the use of neural networks as function approximators to enhance the efficiency of designing multitarget missions in LEO by using electric propulsion.
Recently, interest in the application of machine learning (ML) techniques, in particular deep neural networks (DNNs), to different fields of optimal control has grown [5,6]. In the existing literature, the use of DNNs has been explored in several contexts which are relevant for space trajectory optimization problems.
First, some studies have applied DNNs for real-time guidance applications [7,8,9,10,11,12,13], highlighting their potential for immediate decision making in dynamic environments. In particular, a computationally efficient DNN is presented in [7], demonstrating its near-optimal performance as a controller across different scenarios. It tackles a significant issue in low-thrust propulsion, i.e., missed thrust events, and shows how the DNN can correct the trajectory when such events occur. In [8], the authors propose an approach that integrates optimal control theory with DNNs to tackle various challenges encountered in spacecraft missions. These challenges include navigating unknown deep space environments, managing limited communication capabilities, and handling complex dynamics. The proposed method involves training a DNN by using state–control and state–costate pairs derived from a high-fidelity algorithm. The authors apply this method to two spacecraft missions: the hypersonic reentry problem and the fuel-optimal moon landing problem. The trained DNN plays a pivotal role in enhancing the real-time performance of their algorithm: it is utilized to provide accurate guesses for the initial costates, thereby improving the efficiency of the indirect method during mission execution.
A second application domain of DNNs in space trajectory optimization is as a replacement of traditional methods for solving the BVP produced by indirect methods. Recent studies [14,15] have successfully applied this approach, demonstrating the capability of DNNs to simplify and speed up complex trajectory optimization tasks. In [14], the authors proposed an approach that utilizes artificial DNNs to approximate the solution of optimal control problems by minimizing an error function incorporating the PMP conditions.
An additional application domain, which is of specific interest for the present article, is the use of DNNs to accelerate global optimization processes, particularly those that require frequent evaluation of transfer costs [16,17,18,19]. The effectiveness of this approach stems from the representational power of DNNs, as they can learn the functional mapping from the initial and final states to the objective function. However, obtaining a large and dense dataset for OCPs can be challenging. DNN models can be viewed as complex geometric transformations, and their generalization capability heavily relies on the ability to smoothly interpolate the dataset points. Therefore, having a large and dense dataset is crucial to achieving better models. In the framework of OCPs solved with indirect methods, the collection of samples requires solving BVPs, making dataset generation laborious.
To address this issue, researchers have explored the utilization of adjoint variables, introduced by the indirect formulation of optimal control problems, as a means to regularize the loss function of DNNs [12,13]. Adjoint variables, also known as costates, represent the gradient of the objective function with respect to variations in the initial states [20]. Therefore, a term can be added to the DNN loss function that ensures that the derivative of the DNN’s output (i.e., the predicted objective function) with respect to the initial states given as inputs to the DNN is equal to the initial value of the costates. This regularization aims at enhancing the generalization power of the DNN by constraining it to learn the underlying mathematical law governing the OCP. Consequently, the parameter space (weights and biases) that minimizes the DNN’s loss function is restricted to configurations adhering to this property. This strategy, inspired by physics-informed neural networks (PINNs) [21], aims to enhance the DNN’s performance even when dealing with small datasets.
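As a rough illustration of this regularization, the sketch below evaluates such a loss for a small one-hidden-layer ELU network written in plain Python. The network size, the function names (`value_and_gradient`, `regularized_loss`), the weighting factor `weight`, and the use of mean squared errors for both terms are assumptions made for illustration and are not taken from the cited works; a practical implementation would rely on automatic differentiation in a deep learning framework.

```python
import math


def elu(z):
    """Exponential linear unit with unit scale."""
    return z if z > 0.0 else math.exp(z) - 1.0


def elu_prime(z):
    """Derivative of the ELU activation."""
    return 1.0 if z > 0.0 else math.exp(z)


def value_and_gradient(x, W1, b1, w2, b2):
    """Return the network output V(x) and its gradient dV/dx.

    Hypothetical architecture: V(x) = w2 . elu(W1 x + b1) + b2.
    """
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    value = sum(w * elu(zj) for w, zj in zip(w2, z)) + b2
    # Chain rule: dV/dx_i = sum_j w2_j * elu'(z_j) * W1[j][i]
    grad = [
        sum(w2[j] * elu_prime(z[j]) * W1[j][i] for j in range(len(z)))
        for i in range(len(x))
    ]
    return value, grad


def regularized_loss(samples, params, weight):
    """Mean squared cost error plus a weighted mean squared costate mismatch.

    Each sample is (initial_state, optimal_cost, initial_costates).
    """
    cost_term = 0.0
    costate_term = 0.0
    for x, cost, costates in samples:
        value, grad = value_and_gradient(x, *params)
        cost_term += (value - cost) ** 2
        # Regularization: the input gradient should match the initial costates.
        costate_term += sum((g - lam) ** 2 for g, lam in zip(grad, costates))
    n = len(samples)
    return cost_term / n + weight * costate_term / n


# Hypothetical parameters and a single made-up training sample.
params = ([[0.5, -0.2], [0.1, 0.3]], [0.1, -0.4], [1.0, -2.0], 0.3)
samples = [([0.7, -1.2], 1.5, [0.4, -0.1])]
print(regularized_loss(samples, params, weight=0.1))
```

Setting `weight` to zero recovers the non-regularized loss, which makes the two training variants directly comparable.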
This study seeks to leverage DNNs building on these advancements and develop a more efficient framework for planning multitarget missions, specifically for ADR and OOS operations. This work investigates the ability of DNNs to predict minimum transfer times (the value function) for low-thrust transfers in LEO. In [22], the focus was on building an approximation of optimal transfers for fast evaluations of minimum-time transfers in LEO, but the BVP solution was still required. In contrast, the approach presented here offers a significant advantage: while it can accelerate the convergence of the BVP, its greater benefit lies in its ability to directly estimate the transfer cost, thereby avoiding the need to solve the BVP and reducing computation time. While there is research on using DNNs to estimate the time of flight in LEO transfers [16,17], such studies are typically confined to impulsive transfers and do not use the PINN framework. Our research extends this concept to low-thrust transfers, which also involve constraints on state variables, such as maintaining a spacecraft above a certain altitude limit. This often results in complex optimal trajectories with a three-arc structure, where the middle arc is flown at the altitude limit. This significantly complicates the application of indirect methods, as they require the precise handling of boundary conditions and adjoint variables across multiple phases, making the convergence to an optimal solution more challenging.
In the field of space applications, previous research has applied similar PINN methods to OCPs. The authors in [15] proposed a novel framework for solving BVPs using PINNs in a purely physics-driven manner. Their approach focuses solely on the residuals of the differential equations governing the BVPs. While this method has proven effective for learning optimal controls in intercept problems, it still requires the solution of the BVP and is not suited to obtaining rapid cost estimates for multiple transfers. The present work shifts the focus to these rapid estimations; in this context, the regularization of a DNN using adjoint variables to approximate the value function of an OCP has been successfully employed only by the authors in [12]. Their work, which concerned satellite attitude control without state or control constraints, demonstrated better value function approximations with regularization compared with non-regularized DNNs. In the context of low-thrust trajectories, however, the only prior attempt to use a regularized network for this purpose was made in [13], specifically in an Earth–Venus transfer. The analysis concerned a set of trajectories in a narrow beam around a nominal path. That work focused on retrieving, for real-time guidance, the optimal controls from the DNN as derivatives of the represented value function with respect to the initial states. While the regularized DNN offered improved control approximations, it failed to outperform the non-regularized network in value function approximation. These limitations highlight a significant gap in the space trajectory optimization literature.
The present study aims at extending and improving the application of regularized DNNs to space trajectory optimization. First, the selection of LEO transfers as an application explores the method’s capabilities to deal with state constraints (in addition to the obvious practical interest for LEO missions). Also, this study identifies key hyperparameters enabling a regularized DNN to outperform a non-regularized one for low-thrust orbital transfers in a wide range of the state space. As discussed in [13], the benefits of regularization were either limited or not noticeable when estimating the value function. However, the results presented here will demonstrate the effectiveness of the proposed approach and will show that regularization can improve both the accuracy of value function estimation and the model’s ability to generalize to new data.
The first step in this study consists in building three datasets with different characteristics. In order to speed up the collection of new samples, namely, time-optimal trajectories in LEO between given initial and final states, partial data are used to train non-optimized DNNs. In turn, these coarse DNNs provide warm-started guesses for the unknowns of new transfers. Once the datasets are collected, a thorough optimization of the DNNs’ hyperparameters is performed by using a Bayesian optimization approach. Then, an assessment of the use of costates as a means to regularize the DNNs’ loss is conducted, and the results are compared to those obtained without regularization. Finally, the method is compared to other state-of-the-art algorithms.
Section 2 provides a description of the optimal control problem and the DNN approach. Section 3 describes the collection strategy for the datasets and outlines the fine-tuning process of the DNNs’ hyperparameters. The results are presented in Section 4 and are validated in Section 5 considering multitarget missions. Finally, the conclusions are drawn in Section 6.
5. Validation
The fine-tuned
DNN trained on
is compared with traditional state-of-the-art ML algorithms (namely, bagging, random forest, decision tree, and extra tree) to assess the performance of the DNNs and the overall methodology. All algorithms are trained on the same dataset. The hyperparameters are fine-tuned according to Table 9 by using random search with 5-fold cross-validation, where the training and validation datasets used to train the DNN are combined. The scikit-learn library [30] is used to implement the algorithms and perform hyperparameter tuning. Each algorithm is allowed a maximum of 100 search iterations. The results of the fine-tuned models on the test dataset are presented in Table 10, evaluated by using RMSE, MAE, and mean relative error (MRE). MRE is calculated as
$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| \hat{y}_i - y_i \right|}{y_i},$$
where $y_i$ is the true minimum-time value, $\hat{y}_i$ is the predicted value, and $n$ is the number of test samples. The results clearly demonstrate the superior performance of the DNN, which is capable of predicting minimum transfer times with remarkable accuracy. Although all ML algorithms perform reasonably well, high accuracy is crucial when these models are integrated into global optimization loops, as transfer time prediction errors can accumulate across multiple transfers.
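The three error metrics named above can be computed as in the following minimal sketch. The function name `regression_errors` is hypothetical, and the MRE is assumed to be the absolute error normalized by the true value and averaged over the test samples, which is the usual definition but is not confirmed here by the source.

```python
import math


def regression_errors(true_times, predicted_times):
    """Return (RMSE, MAE, MRE) for paired true and predicted transfer times."""
    n = len(true_times)
    if n == 0 or n != len(predicted_times):
        raise ValueError("inputs must be non-empty and of equal length")
    abs_errors = [abs(p - t) for t, p in zip(true_times, predicted_times)]
    rmse = math.sqrt(sum(e * e for e in abs_errors) / n)
    mae = sum(abs_errors) / n
    # Assumed definition: relative error is taken with respect to the true value.
    mre = sum(e / t for e, t in zip(abs_errors, true_times)) / n
    return rmse, mae, mre


# Made-up transfer times in days, used only to exercise the function.
print(regression_errors([10.0, 20.0], [11.0, 18.0]))
```

Because the MRE divides by the true value, it weights errors on short transfers more heavily than the RMSE or MAE do.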
To further validate these observations, the DNN is applied to a multitarget on-orbit servicing (OOS) mission, where the goal is to minimize the total time required to service 50 satellites (this large value is chosen to highlight the method’s capabilities). The servicer has the same parameters as the spacecraft described in Section 3.1. The target spacecraft states are randomly generated, with altitudes uniformly sampled between
and
, inclinations between
and
, and initial RAANs uniformly distributed between
and
. The objective is to service all spacecraft in the shortest possible time. The transfer times between targets are time-optimal low-thrust transfers, with the costs estimated by the DNN. A fixed service time of 5 days is added after each transfer to account for rendezvous and servicing operations.
The total number of possible sequences is 50!, which is on the order of $10^{64}$. A beam search algorithm with beam width set to 5000 is used to explore this large solution space. For the sake of comparison, the same beam search analysis is also carried out by replacing the regularized DNN with the other state-of-the-art ML algorithms. After the beam search, the best sequence found by each method is optimized by using the indirect method to find the true total mission time, which is compared to the estimated time. The results for the sequence found with the DNN are presented in Table 11.
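The beam search over target sequences can be sketched as follows. This is a generic implementation under stated assumptions, not the authors' code: the callable `transfer_time(origin, destination, departure_time)` is a hypothetical stand-in for the DNN cost estimate, the mission is assumed to start at any target at zero cost, and the fixed service time is added after every transfer.

```python
import heapq


def beam_search(targets, transfer_time, beam_width, service_time=5.0):
    """Return (total_time, sequence) for the best complete sequence found.

    targets       -- list of comparable target identifiers (e.g., integers)
    transfer_time -- hypothetical cost estimator, called as
                     transfer_time(origin, destination, departure_time)
    beam_width    -- number of partial sequences kept at each depth
    service_time  -- fixed time added after each transfer (days)
    """
    # Assumption: any target can be the starting point, at zero elapsed time.
    beam = [(0.0, (t,)) for t in targets]
    for _ in range(len(targets) - 1):
        candidates = []
        for elapsed, sequence in beam:
            visited = set(sequence)
            for nxt in targets:
                if nxt in visited:
                    continue
                leg = transfer_time(sequence[-1], nxt, elapsed)
                candidates.append((elapsed + leg + service_time, sequence + (nxt,)))
        # Keep only the partial sequences with the shortest elapsed time.
        beam = heapq.nsmallest(beam_width, candidates)
    return min(beam)
```

A call such as `beam_search(list(range(50)), estimator, beam_width=5000)` would mirror the setting described above, with `estimator` wrapping the trained surrogate. Since pruning is based only on the estimated elapsed time, an estimator that occasionally reports a long transfer as short can steer the search toward sequences that are poor in reality.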
The total mission time error is below 0.2%, which is an impressive result considering that the sequence consists of 50 low-thrust transfers. Figure 3 compares the performance of the DNN with other ML algorithms on this sequence. The DNN shows a strong ability to estimate transfer times across all 50 legs, producing results that closely align with those obtained by using the indirect method. This high level of accuracy demonstrates that the DNN can reliably approximate transfers, making it a robust and effective surrogate for the indirect method. The other ML algorithms also perform reasonably well in estimating the single transfers that make up the sequence, achieving comparable estimates to those of the DNN. However, slight variations are observed in certain legs, leading to marginal differences in their accuracy. Nonetheless, the overall cost of the sequence is estimated with good accuracy by all methods, with a tendency to underestimate the costs.
When other ML methods are used to estimate the transfer times, the beam search finds best sequences with an apparently lower time of flight. However, verification with the indirect method shows that these trajectories are actually infeasible. As an example, Figure 4 shows the sequence found by using the random forest algorithm, which represents the best alternative ML method tested in terms of accuracy (similar results are found with the other algorithms). While the initial transfers are well approximated, random forest begins to significantly underestimate the transfer time by the fifth transfer. This error accumulates, and by the 15th transfer, the true solution shows that actual duration and propellant consumption are four times larger than the random forest estimation. At this point, 75% of the spacecraft initial mass has already been consumed. Additionally, even the DNN estimate begins to diverge slightly, as the sequence ventures outside the DNN’s training domain for mass (and thus thrust acceleration). By the 24th transfer, the sequence found by the random forest model would actually deplete the entire mass of the spacecraft if used as fuel, making mission completion impossible.
This pattern of divergence is consistent across all alternative ML algorithms. Typically, these algorithms handle short transfers fairly well and approximate sequences accurately for such cases. However, for longer transfers, they occasionally misestimate to the extent that a long transfer is estimated as short. As a result, the beam search saves sequences that are based on poorly estimated long transfers, leading to significant cumulative errors. For the DNN, this behavior is not observed: it consistently provides robust estimates across all transfer lengths, ensuring feasible mission sequences.
The consistency and performance of the DNN are further validated through additional beam searches involving shorter missions. A total of 100 sets, consisting of 10 satellites with random elements, are defined, and the optimal 10-leg sequence is sought for each set. The beam width is reduced to 1000. For each trial, the altitudes are uniformly sampled between
and
(in a range broader than the 50-leg mission), and the initial inclinations and RAANs are uniformly distributed between
and
and
and
, respectively. The results of these trials (Table 12) reaffirm the DNN’s accuracy and reliability. The true mission time (mean value of 237 days) shows large variations and ranges from 185 to 295 days, depending on the random elements of the satellites in each set. Nevertheless, the DNN’s absolute error averages just 1.13 days, with a standard deviation of 1.00 day. The maximum error is below 6 days. The low variance in error demonstrates that the DNN not only provides precise transfer time estimates but does so reliably across different mission conditions. Furthermore, the results highlight the DNN’s ability to generalize well, even in more constrained mission settings. Shorter sequences imply more demanding maneuvers and longer leg durations, as targets are few and far apart. Despite the reduced beam width, which could increase the likelihood of suboptimal sequences being selected, the DNN maintains high levels of accuracy and consistently estimates feasible mission solutions.
To better evaluate the performance of the proposed methodology, the computation time of the DNN framework is also compared to the estimated computation time of the same beam search procedure if the indirect method were used to calculate transfer costs at each step. On average, the indirect method needs approximately 0.2 s to converge to a solution. Given that the beam search requires evaluating 5,760,050 transfers, this results in an estimated total computation time of about 13 days. In comparison, collecting the dataset, tuning the hyperparameters, and training the DNN takes around 2 days, while the beam search procedure using the DNN is completed in just 1 h. This demonstrates that the proposed method significantly reduces optimization time while maintaining accuracy.
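The evaluation count and the resulting time estimate can be reproduced with a short calculation. The sketch below assumes that the first target of a sequence is free (no transfer is evaluated to reach it) and that every retained partial sequence is expanded to all of its unvisited targets; under these assumptions, which are not stated explicitly in the text, the count matches the figure quoted above. The function name is hypothetical.

```python
def count_transfer_evaluations(n_targets, beam_width):
    """Count the transfer cost evaluations performed by a beam search."""
    beam_size = n_targets  # assumption: any target may be the free starting point
    evaluations = 0
    for visited in range(1, n_targets):
        # Each retained partial sequence is extended to every unvisited target.
        expansions = beam_size * (n_targets - visited)
        evaluations += expansions
        beam_size = min(beam_width, expansions)
    return evaluations


evaluations = count_transfer_evaluations(n_targets=50, beam_width=5000)
seconds_per_solve = 0.2  # average indirect-method solve time quoted in the text
days = evaluations * seconds_per_solve / 86400.0
print(evaluations, round(days, 1))  # prints "5760050 13.3"
```

The first two depths contribute 2450 and 117,600 evaluations; from the third depth onward the beam is saturated at 5000 sequences, which accounts for the remaining 5,640,000.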
However, the suitability of the DNN method depends on the specific case. In scenarios where only a few transfer evaluations are required, generating the dataset and training the network can be more computationally expensive than solving the transfers directly with the indirect method. Therefore, the proposed approach is particularly advantageous for large-scale problems requiring numerous evaluations, while for smaller problems, traditional methods may be more efficient.
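A rough break-even estimate follows from the figures quoted above. The sketch assumes that the roughly 2 days of dataset collection, tuning, and training are a fixed setup cost, and that the 1 h beam search, spread over its 5,760,050 evaluations, gives an upper bound on the per-evaluation cost of the DNN; both are simplifications, and the function name is hypothetical.

```python
def break_even_evaluations(setup_seconds, direct_seconds, surrogate_seconds):
    """Number of evaluations above which the surrogate route is faster overall."""
    if surrogate_seconds >= direct_seconds:
        raise ValueError("the surrogate must be faster per evaluation")
    return setup_seconds / (direct_seconds - surrogate_seconds)


setup = 2 * 86400.0             # about 2 days of data collection, tuning, training
direct = 0.2                    # seconds per indirect-method solve
surrogate = 3600.0 / 5_760_050  # 1 h of beam search spread over all evaluations
print(round(break_even_evaluations(setup, direct, surrogate)))
```

With these numbers the break-even point lies in the high hundreds of thousands of transfer evaluations, so a single small mission would not justify the setup cost, whereas a trained network reused across many searches would.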
6. Conclusions
This study explored the application of DNNs to predict minimum transfer times for LEO transfers in the context of global optimization and frequent evaluation of transfer costs. The handling of state constraints and the use of the PINN framework are introduced. The results highlight that using costates to regularize the loss during training significantly enhances the DNN’s accuracy, even with limited datasets. Specifically, DNNs with ELU activation functions in their hidden layers demonstrate exceptional performance when combined with this regularization approach. Further exploration is needed to understand the role of DNN depth, as this study did not reveal clear patterns in this regard.
Training the regularized DNNs on datasets comprising one million samples achieved impressively accurate results in estimating transfer times. A warm-started guess strategy, which involves using simpler DNNs to predict transfer times and costates for new transfers, greatly expedites the process of collecting training datasets. This approach proves highly practical for real-world applications, particularly in LEO low-thrust space missions. The indirect optimization method can deal with both minimum-fuel and minimum-time trajectories. In this article, only the minimum-time case is treated in order to maintain the focus on the DNN’s approximation capabilities. The same approach adopted here could be used to train the DNN on datasets of minimum-fuel solutions with different durations; no changes in accuracy should be expected, since the two problems are substantially equivalent and only differ in two boundary conditions at the final time. However, in this case, the optimization of long sequences becomes a daunting task for the beam search (rather than for the DNN), as the length of each leg becomes an additional unknown. The analysis of minimum-fuel sequences will be the subject of future work.
The comparison of the DNN model with other state-of-the-art ML algorithms shows that the use of costates for regularization significantly improves prediction accuracy. Validation using beam search to optimize the sequence of transfers of a multitarget OOS mission shows excellent accuracy, with the DNN achieving errors in total mission time below 0.2%. This is notable, given that the sequence involves 50 low-thrust transfers. In contrast, the other ML algorithms show significant errors, leading to impractically high propellant requirements.
Finally, the comparison of the computation time of the beam search procedure using either the DNN framework or the indirect method to calculate the transfer costs shows the large time savings offered by the DNN approach. While the indirect method is estimated to take around 13 days to evaluate all transfers, the proposed DNN approach requires about 2 days in total, most of which is spent on dataset collection, hyperparameter tuning, and training, with the beam search itself completed in about 1 h. This efficiency, coupled with high accuracy, underscores the practical advantages of the proposed method, especially for scenarios requiring extensive transfer evaluations.
Overall, the methodology proves highly effective for LEO low-thrust missions and holds significant potential for global trajectory optimization, where it can provide rapid and accurate predictions of minimum transfer costs.