1. Introduction
The optimization of technology represents a highly dynamic field of inquiry, primarily focused on refining various categories of single-objective, multi-objective, and many-objective problems to address complex real-world optimization challenges. Classical approaches such as Newton’s method and gradient descent, commonly used for solving constrained optimization problems, typically require the problem to be convex—a condition rarely met by the non-convex nature of prevalent real-world issues. Additionally, the irregularity of the feasible search space for decision variables, due to imposed constraints, renders traditional optimization methods that rely on gradient information ineffective. In contrast, biologically inspired metaheuristic algorithms and evolutionary algorithms (EAs) do not impose restrictions on problem scale and exhibit enhanced computational and search capabilities. These algorithms have a high probability of converging to global optimal solutions, addressing challenges across various research domains. In recent years, evolutionary algorithms such as Ant Colony Optimization (ACO), proposed through an emulation of ant colony foraging processes [
1], Particle Swarm Optimization (PSO), inspired by bird foraging behavior [
2], the African Vulture Optimization Algorithm (AVOA), modeling the foraging and navigational behavior of African vultures [
3], the Gorilla Optimization Algorithm (GTO), simulating the social behavior of gorillas [
4], the Arithmetic Optimization Algorithm (AOA), based on mathematical operations [
5], Biogeography-Based Optimization (BBO), replicating migration and information exchange between habitats [
6], the Grey Wolf Optimization algorithm (GWO), mimicking the predatory behavior of the Grey Wolf [
7], and the Whale Optimization Algorithm (WOA) [
8], inspired by the hunting behavior of sperm whale populations, have demonstrated significant efficacy in addressing optimization problems. Abundant empirical evidence supports the ability of EAs to discover optimal or near-optimal solutions in global optimization problems. However, inherent drawbacks in traditional evolutionary algorithms, such as slow convergence, low precision, and susceptibility to local optima, highlight the need to enhance their convergence accuracy and search capabilities. Consequently, scholarly efforts are primarily focused on refining existing algorithms, exploring novel evolutionary algorithms inspired by the behaviors of biological populations, and expanding research to boost algorithmic performance.
In recent years, numerous enhanced EAs have been applied across various domains and cited in multiple research endeavors, undergoing experimental comparisons. For instance, the publication [
9] introduces the Adaptive Spiral Flight Sparrow Search Algorithm (ASFSSA), which initializes the population through tent mapping based on random variables, ensuring a more uniform distribution of individual positions. which initializes the population through tent mapping based on random variables, ensuring a more uniform distribution of individual positions. During the discovery phase, an adaptive weighting strategy is amalgamated with the Levy flight mechanism [
10], broadening and enhancing the flexibility of the search approach. In the follower phase, a variable spiral search strategy refines the algorithm’s search range and augments its precision. The feasibility and practicality of ASFSSA are validated through standard test functions and applied to robot path planning. In the work presented in [
11], a Chaos-guided Slime Mold Optimization Algorithm (COSMA) is proposed to minimize the energy cost of high-altitude wind turbines. The COE model is established based on rotor radius, rated power, and hub height to achieve optimal design. In this context, an improved SMA variant called COSMA is introduced, leveraging chaotic search strategies (CSS) and cross-adversarial strategies (COS) for handling nonlinear tasks. COS are introduced to enhance solution diversity and exploration tendencies. CSS are incorporated into the basic SMA to augment developmental capabilities, mitigating early convergence challenges. The algorithm is validated by comparing its effectiveness in high-altitude wind turbines with other algorithms. In [
7], an enhanced Grey Wolf Optimization (GWO) algorithm is presented (MELGWO), combining memory, evolutionary operators, and random local search techniques, further integrating a linear population size reduction technique. This algorithm undergoes comprehensive testing on benchmark functions, engineering optimization problems, data classification, and function approximation, showcasing its superiority over mainstream metaheuristic algorithms in terms of performance. The paper [
12] introduces a novel Particle Swarm Optimization (PSO) variant-Elite Archive-Driven Particle Swarm Optimization (EAPSO), tailored for addressing complex multimodal problems. In contrast to other PSO variants, EAPSO requires only population size and termination conditions to execute search tasks. EAPSO boasts a well-defined structure, establishing three types of elite archives to preserve three different layers of particles. Six learning strategies are then designed to update particle positions by reusing particles from these elite archives. The experimental results establish three types of elite archives to preserve three different layers of particles. Six learning strategies are then designed to update particle positions by reusing particles from these elite archives. The experimental results demonstrate EAPSO’s outstanding performance on over half of the test functions compared to seven other powerful PSO variants, providing highly competitive optimal solutions for considered engineering problems. Moreover, in recent years, various novel intelligent optimization algorithms have been proposed, drawing inspiration from the unique behaviors of diverse organisms in the natural world. Some of these innovative algorithms include the Dung Beetle Optimization Algorithm (DBO) [
13], inspired by the dung beetle’s rolling, dancing, breeding, foraging, and stealing behaviors; the Slime Mold Optimization Algorithm (SMA) [
14], inspired by the slime mold’s foraging behavior; the Golden Jackal Optimization (GJO) algorithm [
15], emulating the hunting behavior of the golden jackal; and the Peacock Optimization Algorithm (POA) [
16], derived from observations of the peacock’s habits and behaviors. These algorithms have found widespread application across various domains, demonstrating notable optimization efficacy.
EAs has shown remarkable results in dealing with optimization problems, in which the solution of optimization problems has a direct impact on the performance and generalization ability of the model, especially in the face of complex models. By combining machine learning algorithms with intelligent optimization algorithms, the limitations of traditional methods in dealing with complex problems can be overcome, resulting in more competitive results. EAs, as a genre of search and optimization techniques, have found extensive application within the realm of machine learning. Confronted with the intricacies of machine learning quandaries, conventional optimization methodologies may find themselves constrained by local optima or entangled in the complexities of high-dimensional data. Intelligent optimization algorithms, by emulating the behavioral patterns of biological populations in the natural world, exhibit the prowess to unearth global optima or suboptima within search spaces devoid of any a priori knowledge. This attribute proves invaluable in addressing the challenges posed by complex machine learning predicaments. The utility of EAs extends beyond singular model training, encompassing facets such as parameter tuning [
17], feature selection [
18], and image recognition [
19]. Their inherent flexibility and scalability position intelligent optimization algorithms as pivotal tools within the domain of machine learning, empowering researchers to grapple with a spectrum of intricate data analysis and predictive tasks. Nevertheless, notwithstanding the efficacy achieved by prevailing EAs’ enhancement strategies, lingering issues persist, including sluggish convergence rates, diminished optimization precision, and susceptibility to local optima. Simultaneously, hyperparameter optimization stands as a critical juncture within machine learning, where the judicious selection of hyperparameters directly influences the model’s performance and generalization capacity. Inappropriately chosen hyperparameters may precipitate model overfitting or underfitting, thereby compromising the predictive and generalization capabilities of the model. In practical applications, given the intricacy and diversity inherent in datasets, the optimal selection of hyperparameters typically unfolds as a multidimensional complex optimization quandary. The judicious and effective selection of hyperparameters serves to enhance the accuracy and stability of the model, thereby elevating the applicability and reliability of machine learning models across a spectrum of real-world challenges. Therefore, in the pursuit of enhancing the applicative worth of intelligent optimization algorithms within the realm of machine learning, this discourse posits a refined iteration of intelligent optimization algorithms. This refinement is conceived to navigate the current algorithmic constraints and amplify their efficacy in the domain of machine learning hyperparameter optimization. The present exposition endeavors to refine the WOA and the ACO algorithm, addressing the exigencies posed by large-scale continuous optimization quandaries, the overall flowchart of the algorithm is shown in
Figure 1. Subsequently, a meticulous delineation of the sources and efficacy of the proposed enhancements shall be expounded upon in the forthcoming sections. It is imperative to underscore that the refined algorithm, as introduced herein, shall not unduly augment the computational intricacies of the algorithm. The primary contributions of this treatise are succinctly encapsulated as follows:
Integrating state transition probabilities into the WOA-based local optimization and the globally oriented exploration based on the Golden Sine Algorithm to guide individual exploration.
Introducing the Flight Ant strategy, inspired by the “flight ants” in ant colony systems, to energize ants trapped in local optima, increasing their chances of escaping.
Proposing a Gaussian scatter search strategy with a multi-stage nonlinearly decreasing standard deviation to balance the algorithm’s global exploration and local exploitation.
Applying MHWACO to optimize the hyperparameters of regression models, such as Support Vector Machines and random forests, and contrasting its predictive performance with eight enhanced algorithms across five distinct domains using real-world data.
The subsequent sections of this manuscript are encapsulated as follows:
Section 2 furnishes the foundational definition of the problem, elucidating the fundamental concepts of ACO, WOA, and pertinent research endeavors.
Section 3 meticulously delineates the algorithm propounded within this discourse.
Section 4 is dedicated to numerical experiments, substantiating the algorithm’s superiority through comparative analyses.
Section 5 applies the algorithm to regression problems in machine learning.
Section 6 encapsulates a synthesis of the algorithm and expounds upon prospective avenues for future research.
2. Related Works
2.1. Constrained Optimization Problems
Taking the paradigm of the minimization optimization problem as an exemplar, the optimization problem mathematical model can be articulated as follows:
Wherein, denotes the objective function, X signifies the decision variables, spanning n dimensions . For each dimension of the decision variable , and , respectively, denote the upper and lower bounds, while represents the inequality constraints, and represents the equality constraints.
2.2. Overview of WOA
Whales are regarded as being among the most emotionally intelligent creatures on Earth. Researchers have unearthed spindle cells akin to those found in the human brain, endowing whales with the ability to engage in communication akin to our own. The WOA mirrors the predation techniques of one of the largest toothed whales, the sperm whale, specifically the bubble-net feeding method [
20]. This feeding behavior comprises two phases: the upward spiral and double spiral. In the initial phase, the whale spirals around its prey underwater at a depth of 12 m, creating bubbles and ascending upstream. The second phase involves coral cycling, capture cycling, and behaviors such as tail-slapping the water surface. Building upon these insights, Mirjalili et al. artfully crafted the WOA.
Assuming the population size is N and the search space is of D dimensions, let the position of the i-th whale in the d-th dimension be denoted as . Here, represents the maximum number of iterations, and the global optimum position is represented as the prey’s location. Given that prior to solving the optimization problem the population lacks any a priori knowledge of the algorithm’s search space, we postulate the current optimal individual’s position as the prey’s location. Other whale individuals will, through diverse computational mechanisms, converge towards the prey’s position. The WOA employs various computational methods for updating whale positions.
2.2.1. Encircling Predation Mechanism
Whales possess the ability to identify prey and encircle them. This process is represented by the following expression:
where
represents the step size,
,
,
is a random number in the range
,
is the convergence factor that decreases to 0 as the number of iterations increases.
C is a random number in the range
, representing the influence of the prey’s position on the distance to the current individual during the iteration process.
2.2.2. Spiral Predation Mechanism
Whales move upstream along a spiral path with a decreasing curvature radius. Using the exhalation of bubbles, they encircle and capture prey. The position update for this process is as follows:
where
represents the distance between the whale and the prey,
b is a constant used to set the logarithmic spiral shape, and
l is a random number in the range
. During this process, the whale roams within the contraction range of the prey while following the spiral path. Therefore, a random number
p is used to set the probability of choosing between the encircling predation mechanism and the spiral predation mechanism.
2.2.3. Random Predation Mechanism
In addition to the spiral predation mechanism, whales also engage in random predation based on the positions of other individuals. This process can be mathematically represented as follows:
where
is the position of a randomly selected whale from the current population,
represents the distance between the current individual and the randomly chosen whale’s position. In this context, the parameter
C signifies the impact of the random position on the distance from the current individual.
The algorithm allocates equal probabilities for the two behaviors of whales, capturing and hunting, by setting the random number
p to 50%. When
, individuals engage in the development phase. Otherwise, they proceed to the search phase. The updated formula is summarized in Equation (
5), and the pseudocode is presented in Algorithm 1:
Algorithm 1 Whale Optimization Algorithm. |
- 1:
Set the initialization parameters, including whale populations size N, dimension D, maximum number of iterations T - 2:
Initialize the position of the whales - 3:
- 4:
while do - 5:
for do - 6:
Calculate the fitness of each agent - 7:
end for - 8:
Update - 9:
for do - 10:
Calculate the coefficients A, C, and l - 11:
Generate random number p in - 12:
for do - 13:
if then - 14:
if then - 15:
Enter the surround prey phase using Equation ( 4) - 16:
Update X - 17:
else - 18:
Enter the surround prey phase using Equation ( 2) - 19:
Update X - 20:
end if - 21:
else - 22:
Bubble net attack prey using Equation ( 3) - 23:
end if - 24:
end for - 25:
end for - 26:
- 27:
end while
|
2.3. Overview of ACO
The ACO algorithm, derived from the foraging behavior of ants in nature, was proposed by Dorigo et al. [
1]. This intelligent biomimetic algorithm exhibits robustness, high positive feedback effects, and the ability to achieve distributed computing. The algorithm’s reliance on random initialization for its search space; however, it introduces excessive randomness, leading to a sluggish convergence rate. Furthermore, the accumulation of pheromones, used by ants to exchange information, can stagnate the algorithm in later stages, hindering its ability to escape local optima.
Initially applied to solve the Traveling Salesman Problem (TSP), where a merchant seeks the shortest closed path passing through each city exactly once, the ACO algorithm operates on the following principles:
2.3.1. Transition Probability
At the algorithm’s onset, ants are randomly distributed across cities, with each ant’s path corresponding to a solution. Throughout the journey, visited cities are added to a taboo list to ensure ant destinations remain unique. When selecting the next city to visit, ants calculate the transition probabilities based on the heuristic information of various paths. At time
t, the transition probability for ant
k moving from city
i to city
j is given by the following equation:
The variable denotes the Euclidean distance between the i-th and j-th cities, represents the residual concentration of pheromones along the path at time t, and characterizes the expected degree of transfer from city i to city j at time t. The parameter signifies the information heuristic factor, where a higher value inclines the current ant towards selecting paths traversed by other ants. On the other hand, denotes the expectation heuristic factor, and an augmented value steers the state transition probability closer to the greedy rule.
2.3.2. Update Pheromone
To prevent an excessive inundation of residual pheromones overshadowing the guiding information, the algorithm, after each iteration, facilitates the evaporation of residual pheromones while introducing fresh pheromones. This mechanism emulates the characteristics of the human brain, which tends to forget outdated information as new insights emerge. The concentration of pheromones along the path is updated according to the following expression:
where
represents the pheromone evaporation coefficient;
denotes the residual pheromone factor. To prevent an infinite accumulation of information, typically the range of
is
, m represents the population size,
signifies the pheromone increment on the
path during this iteration, and
denotes the concentration of pheromones released by the k-th ant on the
path during this iteration.
In discrete problems, the ACO algorithm alters the concentration of pheromones and changes the position status through a calculated point-like distribution. However, in continuous optimization problems, the solution space is an area, and ants need to navigate through the space in a fine-tuning manner to find feasible solutions. Therefore, when applying the ACO algorithm to solve continuous optimization problems, improvements need to be made. There are generally two approaches: first, transform the continuous problem into a discrete one by dividing the search space into discrete intervals, selecting a specific solution in each interval to represent its quality. After choosing the better interval, further optimization is performed using EAs. Second, directly disperse ants into the search space, constructing a pheromone calculation method that is combined with the continuous space. Based on this, Bilchev and Parmee [
21] initially applied ACO to continuous optimization problems. They argued that during the search process, the population repeats searching the same space, leading to computational waste. Therefore, they divided the search space into multiple regions, allowing each individual to search only within their assigned region, reducing the possibility of algorithm repetition. The widely adopted approach for function optimization using ACO involves initializing ants randomly in the solution space, calculating the fitness function value, and determining the global optimum. Then, according to Equation (
8), the state transition probability value
is calculated for the
i-th ant.
where
represents the maximum value of pheromone concentration in the current iteration of the population,
is the minimum value of pheromone concentration, and
is the pheromone concentration of the
i-th ant. If
is less than a random number
, it indicates that the current ant has a higher pheromone concentration, resulting in a smaller corresponding action probability. In this scenario, the ant needs to conduct a local search around the current solution using the Equation (
9). Conversely, if
is greater than
, it suggests that the current ant has a higher action probability. In this situation, the ant should follow the Equation (
10) to perform a global exploration, moving out of the current region in search of feasible solutions to the problem.
Where
represents a decreasing factor that reduces with the increase in the number of iterations,
is a random number within the range [−1, 1], and
and
denote the upper and lower bounds of the search space, respectively. Before the end of each iteration, the pheromone concentration of each ant is updated to be used in the calculation of transition probabilities in the next round. The pseudo-code for the algorithm is as in Algorithm 2.
Algorithm 2 Ant Colony Optimization. |
- 1:
Set the initialization parameters, including whale populations size N, dimension D, maximum number of iterations T, pheromone evaporation coefficient - 2:
Initialize the position of the whales - 3:
t = 0 - 4:
while do do - 5:
for do - 6:
Compute the probability of state transition by Equation ( 8) - 7:
Create a random number - 8:
if - 9:
Enter Local search using Equation ( 9) - 10:
else - 11:
Enter Global search using by Equation ( 10) - 12:
end if - 13:
end for - 14:
Update pheromone concentration for each ant for the next iteration. - 15:
- 16:
end while
|
2.4. Review of Improved Algorithms
Due to its primary application in discrete optimization problems, the ACO algorithm has been less explored and enhanced for continuous optimization problems [
22,
23,
24]. Therefore, this section predominantly delves into the literature on the application and improvement of the WOA. Since its inception, the WOA has garnered extensive attention from researchers across various domains. Scholars have applied it to diverse practical optimization problems, including engineering, clustering, robotic pathfinding, image processing, task scheduling, and more. Presently, the avenues for enhancing EAs primarily focus on integrating various strategies to improve algorithms and combining them with other intelligent optimization algorithms. One approach involves blending two algorithms, creating a model that amalgamates the strengths of both. This model enhances the algorithm’s exploration and exploitation capabilities, yielding favorable results when applied to specific problems. In a notable study, Xu et al. [
25] combined chaotic local search with the WOA, introducing a novel hybrid optimization algorithm. The algorithm demonstrated superior convergence speed and solution accuracy over other algorithms across 48 benchmark functions. Singh and Hachimi [
26] proposed a novel hybrid metaheuristic optimization algorithm, leveraging the spiral equation of WOA to balance exploration and exploitation in GWO. This approach, applied to the entire population, prevented premature convergence and local minima. The results showcased higher stability, faster convergence, and greater computational accuracy in benchmark test functions, breast cancer, welded beam design, pressure vessel design, among other problems. Similarly, Laskar et al. [
27] introduced the Hybrid Whale–PSO Algorithm (HWPSO), blending the WOA with the Particle Swarm Optimization (PSO) algorithm to overcome the tendency of PSO to get stuck in local optima during the exploration phase. The hybrid process introduced two novel techniques: “forced whale” and “limitation”. The forced WOA in the exploration phase assisted PSO in avoiding local minima, while the new limitation phenomenon restricted the WOA’s search mechanism during the development phase, aiding faster convergence to local optima. HWPSO’s performance was tested on 18 benchmark mathematical functions and 3 electronic design optimization problems, demonstrating favorable outcomes. In another endeavor to improve, the authors of [
28] proposed the Enhanced WOA (EWOA) to address high-dimensional optimization problems. Analyzing the WOA’s limitations, they made modifications to enhance its exploration capability, accuracy, and early convergence issues. A unique selection parameter was introduced to balance global and local search phases. Modifications were made to the coefficient vectors A and C for better exploration and utilization of the search region. The algorithm allowed random movement during the exploration phase to reduce computational burden. An inertia weight was introduced during the optimization phase for a thorough search near the optimal solution. Comparative experiments validated the algorithm’s superior performance in high-dimensional problems. Lastly, Lin et al. [
29] introduced a heuristic algorithm based on the WOA and partitioning strategy (NHWOA) for solving global multidimensional engineering optimization problems. The algorithm incorporated partitioning techniques to enhance population diversity, suppress premature convergence, and adjust algorithm parameters through heuristic mechanisms to boost the search agent’s exploration potential. The application of the algorithm to verification cases of CEC2014 benchmark functions and global optimization practical engineering problems demonstrated its robust global computational capabilities.
Increasingly, scholars are applying enhanced versions of the WOA to address practical problems. Jadhav et al. [
30] introduced a data clustering algorithm named WGC, which incorporates a novel fitness function and integrates the principles of GWO and the WOA. WGC performs data clustering by computing optimal centroids, with the fitness function dependent on three constraints: inter-cluster distance, intra-cluster distance, and cluster density. The clustered data encompass essential information for decision-making, offering crucial insights. The experimental results indicate that the proposed WGC surpasses existing methods, holding significant relevance in decision-making within complex data environments. Bentouati et al. [
31] presented a novel hybrid optimization algorithm for solving optimal power flow problems in power systems. This algorithm combines the WOA with the pattern search algorithm (PS) and is applied to the IEEE 30-bus test system, considering multiple objective functions such as generation fuel cost, voltage profile improvement, minimization of total power losses, and emission reduction. The simulation results demonstrate the effectiveness of the algorithm in solving the optimal power flow problem. Karlekar et al. [
32] proposed an Ontology-based Whale Optimization Support Vector Machine (OW-SVM) approach for the privacy-preserving classification of medical data. This method initially utilizes the Kronecker product bat algorithm to generate privacy-protected data, followed by ontology construction for feature selection. The combination of ontology and the WOA is then integrated with support vector machines in the OW-SVM method. Experimental comparisons using three heart disease datasets show that the OW-SVM method outperforms other approaches in accuracy, sensitivity, specificity, and adaptability. In another study, the WOA was applied to optimize a Support Vector Machine (SVM) model for predicting tunnel squeezing behavior [
33]. The research involved the classification of 114 real cases, considering multiple input parameters. The evaluation results demonstrated that the WOA-SVM model achieved the highest accuracy in classifying tunnel squeezing, reaching an accuracy of approximately 0.9565, showcasing high accuracy and reliability in predicting tunnel squeezing. Yang et al. [
34] proposed a multi-strategy WOA (MSWOA) applied to engineering problems. The paper introduced four strategies, including the use of highly randomized chaotic logic mapping to generate high-quality initial populations, enhancing algorithm development and exploration capabilities through adaptive weights and dynamic convergence factors, introducing the Levy flight [
10] mechanism to maintain population diversity. Additionally, the paper discussed the application of Semi-Supervised Extreme Learning Machines (SSELM) in log recognition and optimized SSELM parameters using MSWOA. Experimental comparisons illustrated MSWOA’s significant advantages and effectiveness in solving global optimization problems. Anitha et al. [
35] proposed an improved WOA (MWOA) for optimizing the threshold in color image threshold segmentation. Otsu’s and Kapur’s functions were used as fitness functions, controlling individual positions through cosine functions, and introducing correction factors to adjust individual search processes. The strategies included in MWOA achieved a balance between local optimization and global exploration, reducing algorithm computation time. Liu et al. [
36] introduced a WOA-based hybrid algorithm (WOA-LFDE) for solving Job Shop Scheduling Problems (JSSP), combining Levy flight and differential evolution during the iterative process. Experimental results and statistical analysis demonstrated that the algorithm outperformed other competitive algorithms.
In the seven short years since its inception in 2016, researchers have addressed the WOA’s limitations by enhancing the algorithm itself or integrating it with new algorithms to construct novel models applied to various fields to solve real-world problems. While the WOA has been widely researched in multiple domains, little research has been conducted on high-dimensional optimization problems and combined algorithms with ACO. As ACO is typically used for discrete optimization problems, there is minimal research on variants of the WOA combined with ACO to solve function optimization problems. This paper aims to amalgamate these two algorithms, harnessing their respective strengths to create a unique optimization spark. The construction of an effective optimization algorithm to address high-dimensional optimization problems is an urgent challenge. Thus, filling the research gap in ACO and exploring new avenues for algorithm improvement applications not only promotes further algorithm development but also offers a new perspective for algorithm improvement applications. Based on this, this paper attempts to merge a multi-strategy hybrid ACO algorithm with a WOA to solve parameter optimization problems in high-dimensional optimization and real-world scenarios.
3. Proposed Solution
In this section, we shall elucidate a novel MHWACO algorithm devised for the resolution of optimization quandaries. The algorithm combines the WOA and ACO, integrating multiple strategies. Initially, a Gaussian perturbation is applied to the initialization of data points, fostering a more uniformly efficient exploration of the search space. Subsequently, the determination of utilizing either a local optimization algorithm or a global search algorithm is contingent upon the pheromone concentration information. One facet involves the infusion of an enhanced WOA’s spiral predation mechanism into the algorithm, aspiring to augment the precision of algorithmic searches. The other facet employs the golden sine algorithm to enhance the algorithm’s global exploration capabilities. Following this, a flight ant search strategy is applied to points in the algorithm that remain unchanged. This strategy aids individuals that have not ventured beyond local optima, endowing them with the energy to explore a more extensive solution space for superior outcomes. If the proportion of unchanged individuals surpasses a threshold, a Gaussian scatter search strategy is invoked. This aims to comprehensively balance the algorithm’s capabilities for global exploration and local optimization. The parameter values utilized in the text are detailed in
Table 1.
3.1. Gaussian Initialization
In the realm of intelligent optimization algorithms employing parallel iterations, the initialization of the population significantly influences the algorithm’s search capability and convergence precision [
37]. Conventional intelligent optimization algorithms often employ a randomized initialization method to generate the initial population. Individuals generated in this manner are likely not dispersed throughout the entire search space, leading to the algorithm requiring multiple iterations to produce feasible solutions. Unfortunately, there is a risk of the algorithm becoming ensnared in a local optimum and being unable to escape. Moreover, some intelligent optimization algorithms heavily rely on the quality of the initial solution. Therefore, initialization methods fostering strong diversity lay the groundwork for superior global exploration. Presently, various methods addressing initial solutions have been proposed, including Opposition-Based Learning (OBL) [
38] and chaotic mapping initialization [
39]. These strategies aim to reduce the randomness of initialization, providing the algorithm with an initial population that optimally supports iterations, minimizing the likelihood of falling into local optima. In this work, Gaussian mutation is employed during initialization. Initially, the algorithm generates
N random points in a uniform distribution within the interval
, where each point is chosen with equal probability. Subsequently, Gaussian perturbation is applied to mutate the population, ensuring diversity throughout the algorithm’s solving process. Samples generated by Gaussian mutation follow a Gaussian distribution, a continuous probability distribution whose Probability Density Function (PDF) describes the scenario where a random variable follows a normal distribution. The PDF of the Gaussian distribution is typically expressed as follows:
In this context, x signifies the value of a stochastic variable. stands as the mean, representing the central position of the distribution. denotes the standard deviation, illustrating the extent of dispersion in the distribution. The Gaussian distribution manifests as a bell-shaped curve, with its peak situated at the mean , while the standard deviation dictates the breadth of the curve. In this manuscript, random points serve as the means, and the standard deviation is set to , denoted as Gaussian(). Following Gaussian perturbation of these random points, N points are generated. Upon calculating the fitness function values, the points are sorted, and the top N points are selected as the initial population for the algorithm. This approach yields an initial population with a more uniform distribution in the search space compared to random initialization. It provides a superior starting point for the algorithm’s iterations, preventing it from succumbing to local optima. Simultaneously, the impact of Gaussian perturbation is governed by the parameters of mean and standard deviation. When extending Gaussian initialization to other domains, these parameters can be adjusted according to specific requirements.
3.2. Local Search Strategy
In the ACO algorithm, the ant colony considers the concentration of pheromones on a path to decide whether to traverse it, leaving behind new pheromones after traversing. Simultaneously, the pheromones on this path will evaporate over time. As time progresses, paths with higher residual pheromone concentrations are more likely to be chosen, thereby achieving a balance between global exploration and local exploitation. The transition probability in this paper is calculated as . When the ant encounters lower concentrations of pheromones, there is a greater probability of global exploration, whereas with higher pheromone concentrations, the ant is more likely to engage in local optimization. By introducing a randomly generated number , when , the algorithm performs global exploration, and when , the algorithm engages in local exploration.
Traditional EAs often struggle to adjust the search state based on an individual’s position in the search space during local exploration. This results in low adaptability and unsuitability for different search states. Faced with a larger search space, random search directions can cause the algorithm’s solutions to be too dispersed, making convergence to the global optimum challenging. In such scenarios, it becomes essential to guide the population’s purposeful movement to enhance the algorithm’s search capability. To address this, this paper integrates the spiral predation mechanism from the WOA to capture prey for precise searching. This strategy assumes that in a search space where the optimal position is uncertain, the prey’s location is either at the optimal value or in its vicinity. Based on this assumption, other whales update their positions accordingly. The updated points are evaluated based on their objective function values, and the one demonstrating optimal performance becomes the new position. The updating formula is as follows:
Replace with in the original algorithm, as in the later stages of the algorithm, one can fully leverage individual information to explore the potential for optimal solutions around each individual. The step size D is set as , where multiplying by ensures that individuals in the population converge toward the most optimal individual while fully utilizing positive feedback effects. Allocating shorter step sizes to individuals with higher pheromone concentrations enables them to make efficient use of valuable information for precise searching. Thus, this paper sets a shorter step size to assist the algorithm in finely adjusting individual positions, bringing them closer to the global optimum and thereby enhancing the algorithm’s search precision and convergence capability. Additionally, this approach prevents the algorithm from prematurely leaving the vicinity of the global optimum, ensuring that it can conduct more detailed searches near the global optimum without missing the optimal solution due to excessively large step sizes.
3.3. Global Search Strategy
The traditional ACO algorithm is better suited for discrete combinatorial optimization problems [
40,
41], but its performance in continuous optimization problems is comparatively inferior to other EAs. When confronted with high-dimensional functions, ACO demands extensive iterations to uncover viable solutions, significantly diminishing the algorithm’s convergence velocity. Presently, the prevalent ACO algorithms employ a global search method, as shown in Equation (
10). This computational approach lacks the capacity to sufficiently escape local optima when multiple peaks exist in the solution space. Therefore, this paper integrates the Golden Sine Algorithm (Golden-SA) [
42] to enhance the global search strategy.
The Golden-SA proposed by Tanyildizi et al. [
42] in 2017 is a novel intelligent algorithm. Inspired by the sine function, this algorithm traverses all points on the sine curve, expanding its exploration range. Unlike EAs, the Golden-SA does not simulate phenomena in the natural world, eliminating the need for assumptions during computation. Additionally, the algorithm incorporates the golden ratio to obtain coefficients for individual position updates, ensuring comprehensive exploration of the solution space in each iteration. This significantly enhances the algorithm’s global exploratory capabilities and accelerates convergence efficiency.
The essence of the Golden-SA lies in the process of solution updating. Throughout the algorithm’s iterative journey, a solution corresponds to an individual’s position, with
signifying the global optimum during the t-th iteration. When
, the algorithm proceeds to update the individual’s position for the
iteration according to Equation (
13).
Wherein,
represents a random number within the range
, utilized to determine the individual’s displacement.
is a random number within the range
, employed to ascertain the individual’s direction of movement. The coefficients
and
are derived from the golden ratio
, with
and
. Following an ant’s local or global search, the fitness function value of its new position is computed; if superior, the position is replaced, and if inferior, the original position is retained. The pseudocode for local search and global search is shown in Algorithm 3.
Algorithm 3 Global + Local Search Algorithm. |
- 1:
Input: The Position Of The Ants - 2:
for do - 3:
Calculate A Random Value - 4:
Calculate - 5:
if then - 6:
Calculate By Equation ( 12) - 7:
else - 8:
Calculate By Equation ( 13) - 9:
end if - 10:
if then - 11:
- 12:
end if - 13:
end for
|
3.4. Flight Ant Strategy
In the realm of reality, ant colonies exhibit a profound social structure, featuring a subset of ants equipped with wings that enable them to leap further, unbound by the conventional stride limitations of regular ants. These remarkable ants, known as “flight ants”, converge at specific times and locations, forming extensive airborne colonies. This strategy ensures that the population can explore and establish new ant nests in broader areas with abundant food sources, fostering the lifecycle and proliferation of the ant colony. When applied to practical problems, introducing flight ants aids the colony in exploring the search space more extensively, uncovering potential global optimal solutions. In instances of elevated pheromone concentrations, ants tend to get trapped in local optima, reluctant to explore new paths. Ants endowed with the ability to fly can transcend the constraints of pheromone concentrations, boldly experimenting with novel routes, thereby enhancing the probability of global search. The incorporation of this strategy helps prevent ants from concentrating around local optima, increasing the algorithm’s chances of discovering superior solutions. Hence, this paper proposes the Flight Ant mechanism (FA) to guide the entire ant colony in seeking feasible solutions to the problem.
This paper bestows the capability of jumping upon individuals whose positions remain unchanged, enabling them to leap beyond their current boundaries. Simultaneously, to address the issue of traps in regions with high pheromone concentrations, we integrate the transition probability into the calculation of flight ants. This grants ants with high pheromone concentrations stronger jumping capabilities, ensuring a more comprehensive search of the problem space by the ant colony. To simulate the jumping actions of ants, this paper incorporates the Levy flight function [
10] into FA, with the calculation method presented as follows:
Wherein,
represents the state transition probability of the
i-th ant.
denotes the distance from the
i-th ant to the global optimum. The angular parameter,
, governs the periodicity and phase of the cosine function, calculated as
, with
being a random number between [0, 1]. The parameter
a is computed as
. This paper computes the random step lengths conforming to the Levy distribution using Equation (
15).
Wherein,
and
v conform to a normal distribution:
The definitions of
and
are as follows:
The range for
is [0, 2], typically chosen as
. The step size of Levy flight, distinct from Gaussian random processes, follows a Levy stable distribution. Levy flight has been widely applied in the improvement of EAs, enhancing the performance of algorithms in combination with approaches like the AOA [
43], Seagull Optimization Algorithm [
44], and Crawling Animal Search Algorithm [
45]. These modified algorithms have shown improved performance in various domains. Through experiments, this study validates that the introduction of the FA mechanism significantly increases the likelihood of the ACO algorithm escaping local optima when addressing complex problems. This mechanism empowers a subset of ants with robust mobility, enabling them to explore problem spaces more extensively and enhancing the chances of finding global optima. The successful application of this strategy can substantially optimize the performance of the ACO algorithm in solving function optimization problems, making it more suitable for intricate optimization tasks.
3.5. Gaussian Scatter Search Strategy
Inspired by the stepwise search strategy applied in natural language processing as presented in the article [
46], we propose a Gaussian scatter search (GSS) strategy. The original algorithm’s concept involves selecting a target path in the solution space for an optimization case, sequentially searching variable dimensions, updating scatter point steps, and dynamically conducting a multi-point search following this approach until generating test cases that cover the target path. Integrating this concept into the ACO algorithm ensures the algorithm covers the solution space as extensively as possible, thereby enhancing algorithm diversity.
When the proportion of unchanged individuals in the population exceeds a threshold, the search strategy proposed in this section will be employed to strengthen the algorithm’s exploration capability in the solution space. The Gaussian scatter search strategy consists of two main steps. Firstly, Gaussian perturbation is used to generate a target solution
for the optimization individual. Gaussian perturbation is characterized by being controlled by the mean and standard deviation, where the mean determines the central location of the disturbance value distribution, and the standard deviation determines the extent of the distribution’s spread. This means that most of the generated disturbance values are concentrated around the mean, with a lower probability of points appearing far from the mean. Taking advantage of this characteristic, this study utilizes the optimal solution of each iteration as the mean for adaptive Gaussian perturbation, and the generated disturbance points serve as target solutions. Due to its stochastic nature, it ensures that the perturbed data points exhibit a certain degree of variability. The standard deviation uses a piecewise non-linear decreasing function to balance global exploration and local exploitation of the algorithm. Setting a large standard deviation in the early iterations allows the random values to disperse widely, facilitating a global search and enhancing the algorithm’s ability to escape local optima. In the later iterations, setting a smaller standard deviation is beneficial for local exploitation, improving the algorithm’s convergence accuracy. Based on the above analysis, the algorithm is divided into three stages: early, middle, and late. In the first and third stages of the evolutionary algorithm, a non-linear decreasing formula is used to calculate the standard deviation. In the second stage, a linear decreasing formula is used to calculate the standard deviation. The introduced calculation formulas are as follows:
The second step involves generating a step length, named ’step’, based on the target solution to search for a more optimal solution in the vicinity of the current solution. This process is illustrated in Algorithm 4. Initially, the data points are input to be optimized and the step length is generated based on the distance between the current point and the target point. When the iteration count is less than
S, a new position is created based on the current point and step length. After computing the fitness function value, compare it with the existing one. If superior, replace it; otherwise, retain it. Proceed to the next iteration until the termination condition is met.
Figure 2 provides an illustrative example of this strategy using the Ackley benchmark test function. Due to size constraints, the chart displays only four decimal places. Random Gaussian perturbation is applied to the data point (1.5, 1.5, 1.5) to generate the target point (3.9329, −0.5941, −0.9325), resulting in an initial step length of (1.2164, 2.0941, 2.4325). Subsequently, test cases are generated based on this step length, starting with the first dimension’s test values. After calculating and comparing the function values for the three points, the point with the minimal fitness function value (0.2835, 1.5, 1.5) is selected to replace the original point. The new step length is computed for the next iteration. After ten iterations, the optimal value for the first dimension is determined to be 0.0007, which is saved for subsequent iterations in the second and third dimensions. The final point obtained is (0.0007, 0.0009, 0.0010), serving as a superior point that supersedes the original data point. The intentional use of the GSS allows the algorithm to approach optimal values purposefully, effectively avoiding the possibility of blindly exploring the solution space. Simultaneously, it enhances the diversity of the search, thereby increasing the algorithm’s likelihood of discovering the global optimum.
Algorithm 4 GSS Calculation Process. |
Input: Ant positions , search space bounds , - 1:
for to N do - 2:
Generate a target point by Equation ( 18) - 3:
- 4:
for to D do - 5:
- 6:
while do - 7:
- 8:
- 9:
- 10:
Boundary handling - 11:
Calculate the fitness of , - 12:
- 13:
- 14:
- 15:
end while - 16:
end for - 17:
end for
|
3.6. Updating Pheromones
In the ACO algorithm, pheromone updating refers to the process of updating pheromones based on the paths traversed by ants. Pheromone updating involves two steps: simulating the real-world process of pheromone evaporation, preventing an excessive accumulation of pheromones that might diminish the algorithm’s global exploratory capability, and ants leaving pheromones along the paths they traverse, guiding other ants to have a higher probability of following these paths. This paper updates the pheromone concentration at the end of each iteration, and the calculation formula is as follows:
where
is the pheromone evaporation coefficient,
is the residual pheromone concentration multiplied by the current pheromone concentration,
Q is the pheromone weight, and
is the fitness value of individual
. The formula for
is
, with
and
set to provide an adaptive pheromone evaporation mechanism, enhancing the algorithm’s robustness and adaptability to changes in the problem space. This adaptive approach prevents drastic changes in results due to the influence of previous search paths. Reducing the pheromone concentration with the iteration count makes the algorithm more flexible, allowing it to adapt to complex problem spaces without getting stuck in rigid states induced by accumulated pheromones.
In summary, this paper proposes a large-scale optimization algorithm based on a hybrid of ACO and a WOA. The article initially employs Gaussian initialization to provide high-quality initial individuals for the algorithm. Subsequently, the transfer probability
determines whether the algorithm focuses on local or global optimization. The incorporation of the WOA’s local optimization strategy enhances the precision search capabilities of individuals with high pheromone concentrations. Simultaneously, the integration of the Golden-SA’s global optimization strategy improves the spatial search abilities of individuals with low pheromone concentrations. The algorithm employs the FA strategy for unchanged points, providing energy to escape local traps. When the algorithm meets
, indicating low activity among individuals in the population, the GSS is utilized to guide unchanged points in exploring potential better solutions. The overall flowchart of the algorithm is depicted in
Figure 3.
4. Results and Experiments
In the forthcoming sections, we will carefully analyze and validate the efficacy of the MHWACO algorithm for addressing function optimization problems. This chapter aims to scrutinize the rationale behind the proposed method, comparing its outcomes with algorithmic variants and innovative intelligent optimization algorithms. The comparative experiments involve 34 classical benchmark functions and the CEC 2022 test suite; the benchmark functions are as illustrated in
Table 2. The algorithmic implementation is scripted in Python 3.8, utilizing PyCharm 2021 as the development tool. The experimental setup is executed on a computational system equipped with an Intel(R) Xeon(R) E-2224 processor, operating at a clock speed of 3.40 GHz, and boasting 16.0 GB of memory.
In
Section 4.1, we juxtaposed the results of functions in dimensions 30, 50, and 100 under varying population sizes, validating the algorithm’s robustness. In
Section 4.2, to substantiate the efficacy of the proposed strategies, we conducted comparative experiments by amalgamating the results of each strategy across dimensions 30, 50, and 100. In
Section 4.3, we compared our algorithm with newly proposed intelligent optimization algorithms to affirm the sophistication of the MHWACO algorithm. In
Section 4.4, to demonstrate the effectiveness of our proposed algorithm in addressing high-dimensional optimization problems, we assessed the algorithm’s optimization capabilities in dimensions 200, 500, and 1000. Given the stochastic nature of EAs, results may exhibit variability in each round. Consequently, we computed the average of the experimental results after 30 runs to derive the final outcome. The algorithm employed consistent experimental parameters: population size N = 100, maximum iteration count T = 2000, GSS maximum iteration count S = 10, and pheromone weight Q = 1. The parameter settings for the improved algorithms used in comparative experiments and the parameters of novel intelligent optimization algorithms are outlined in
Table 3.
4.1. Population Size Analysis
The population size plays a crucial role in the effectiveness of population-based EAs, where a larger population can comprehensively cover the solution space, enhancing the algorithm’s stability. However, an excessively large population increases computational costs, diminishing the algorithm’s convergence efficiency. Conversely, a too small population size may lead to the algorithm becoming trapped in local optima, exhibiting poor robustness. Thus, it is imperative to determine an appropriate population size, striking a balance between resource utilization and algorithm performance. This section conducts experiments with population sizes set at 25, 50, 75, 100, and 150, as shown in
Table 4 (
–
).
From the table, it is evident that as the population expands, the algorithm’s convergence accuracy improves, increasing the probability of converging to the optimal solution. However, this enhancement effect is not particularly pronounced for the proposed algorithm in this study. For functions to , to , , and to , the algorithm converges to the optimal solution regardless of the population size. Even in the case of function , there is no significant difference in the optimal solution after reaching a population size of one hundred or more. This suggests that the search performance of MHWACO has a low dependence on different population sizes, and a larger population does not necessarily yield superior solutions than smaller populations. The algorithm’s obtained optimal solution is largely dependent on the search mechanism designed in this study. MHWACO can more effectively utilize each individual, whereas the Golden-SA globally optimizes individuals’ meeting conditions, enhancing the algorithm’s exploration range. Furthermore, LF optimizes based on unchanged points, strengthening the algorithm’s ability to escape local optima. The GSS maximizes the use of individuals in the population, enhancing the algorithm’s coverage in the early stages and precise search capability in the later stages. The elite search strategy promptly adjusts the search direction, efficiently utilizing information carried by excellent individuals, guiding the population to converge rapidly to the optimal solution. Therefore, even for smaller populations, the effective utilization of outstanding individuals can facilitate efficient searching. The analysis results indicate that MHWACO possesses high robustness and is less susceptible to the influence of population size.
4.2. Strategy Effectiveness Analysis
This section will intricately dissect the refinement strategies to validate their efficacy in seeking the optimal solution for the posed problem. The algorithm’s specific manifestations are delineated in
Table 5. Through this comparative approach, the distinct effects of each refinement strategy can be vividly showcased. The comparative outcomes are illustrated in
Table 6.
From the aforementioned table, it is evident that MHWACO attains mean and standard deviation values on various benchmark functions that are smaller than those of ACO, ACO1, ACO2, and ACO3. Moreover, in the majority of functions, MHWACO exhibits precise convergence to the theoretical optimum, showcasing a heightened level of convergence accuracy. Additionally, the convergence accuracy of ACO and its three variants noticeably diminishes with an increase in dimensionality. Even when D = 100, MHWACO manages to maintain a robust convergence accuracy. Therefore, MHWACO demonstrates robust stability across different functions and dimensions, displaying resilience to the influence of problem dimensions. This resilience stems from calculating transfer probabilities for different individuals based on positive feedback effects, selecting global optimization or local optimization. This allows individuals with high information concentration in the population a greater probability of achieving high-precision convergence, while individuals with low information concentration have a higher probability of global optimization. Throughout this process, unoptimized points may still exist. In such cases, the FA strategy is applied to these individuals, endowing them with energy to escape local traps. Additionally, when the activity level of individuals in the population is low, the GSS is employed to enhance the algorithm’s ability to balance global exploration and local optimization. Despite improvements in the results of ACO1, ACO2, and ACO3 variant refinement strategies, these three variants notably lag behind MHWACO in convergence accuracy on benchmark functions, especially in functions like f5, f7, f9, and f10, where the optimization differences are substantial. In summary, the outcomes derived by MHWACO are not contingent on a singular refinement strategy. Instead, these three strategies complement and synergize with each other, forming a comprehensive system.
4.3. Comparison between MHWACO and EA’s Variants
To further substantiate the algorithm’s sophistication, this section will juxtapose MHWACO with distinguished variants of EAs. These algorithms, namely MSWOA [
34], MWOA [
35], LFDE [
36], NHWOA [
44], EWOA [
29], COSMA [
11], EAPSO [
13], MELGWO [
12], ASFSSA [
9], represent recent advancements in the field, originating from highly influential journals with commendable citation rates.The comparative results are illustrated in
Table 7 and
Table 8. The progression of iterations and the visual contrast of effects are exemplified in
Figure 4 through the presentation of box plots.
From the data presented in the table, it is discernible that MHWACO consistently yields results comparable to or superior to current mainstream algorithms across various test functions and dimensions. Regarding stability, the algorithm’s average optimal values exhibit negligible variation with an increase in dimensionality, and the standard deviation values remain relatively stable, showcasing the algorithm’s exceptional stability and robustness. In terms of global search capability, thanks to the formidable leaping ability conferred by FA, the algorithm adeptly escapes local traps, enabling it to locate local optimal solutions even in multi-modal functions, highlighting its formidable prowess in global search. Furthermore, the algorithm demonstrates high operational efficiency, swiftly converging to near-optimal solutions within a short timeframe. However, akin to the majority of intelligent optimization algorithms, this algorithm cannot be universally applied to address all function optimization problems flawlessly. For instance, in the case of the f12 function, its efficacy is relatively lower than that of ASFSSA, with a minimum value that is three orders of magnitude lower at a dimensionality of 30. Interestingly, when the dimensionality increases to 100, MHWACO’s algorithmic precision shows no significant reduction. In comparison, ASFSSA experiences an eight-order-of-magnitude decrease, indicating a lower stability compared to the algorithm proposed in this paper. Thus, this demonstrates MHWACO’s commendable performance in general optimization problems, establishing it as a promising algorithm.
From the figure, it is discernible that MHWACO can yield solutions comparable to or even superior to the refined algorithms. Across 34 function outcomes, MHWACO consistently demonstrates stability, as shown in the box plot. In comparison, in terms of the iterative performance of certain algorithms, MHWACO’s iteration speed may lag behind the LFDE and EWOA algorithms. This disparity arises because the probability of utilizing the GSS strategy in the early stages of the algorithm is relatively low. While the GSS strategy purposefully converges toward the optimal solution, its computational duration significantly increases with escalating dimensions. Therefore, to balance algorithmic complexity with optimization efficacy, this study configures the activation of the GSS strategy only when population activity is low, aiming to uncover latent solutions to the problem. However, the presence of this strategy substantially enhances the algorithm’s robustness. Upon scrutinizing the box plot constituted by 30 optimization results, it becomes evident that MHWACO’s optimal, worst, and average values are consistently within a stable range. In contrast, the performance of other algorithms tends to have wider box plots due to a higher prevalence of outliers, underscoring the heightened stability and robustness of the algorithm proposed in this paper.
To scientifically validate the sophistication of the proposed algorithm in this manuscript, a Friedman test was conducted. As depicted in
Table 9, the experimental results of all the enhanced algorithms, except for the ASFSSA, exhibit significant distinctions from MHWACO. Notably, the performance of the EWOA is suboptimal in the case of the f5 function, yet the algorithm converges to the optimal value for the f6 function. Conversely, the COSMA ranks poorly in the results for the f6 function but achieves favorable outcomes for the f5 function, thereby illustrating the divergent effects of different algorithms across various functions. Concurrently, it is evident that both MSWOA and EAPSO experience a noticeable deterioration in convergence as dimensions increase, whereas MHWACO adeptly addresses the challenge of dimensionality escalation. In summary, compared to the algorithmic variants contrasted in this paper, MHWACO demonstrates elevated robustness and efficacy.
4.4. Comparison of MHWACO and Novel Algorithms
To substantiate the superiority of MHWACO, in this section, we shall juxtapose MHWACO with nine contemporary and widely-cited novel optimization algorithms proposed in recent years. These algorithms include GWO [
7], WOA [
8], AVOA [
3], GTO [
4], DBO [
13], SMA [
14], AOA [
5], GJO [
15], and POA [
15]. CEC2022 has a choice of 2, 10, and 20 dimensions, and in this paper we will compare the results of the algorithm for dimensions 10 and 20. The comparative results are presented in
Table 10.
The proposed algorithm undergoes a comparison with 10 recently proposed intelligent optimization algorithms and 9 enhanced algorithms across 34 benchmark functions and the CEC2022 test suite. From
Table 10, it is evident that, across different dimensions (10, 20), MHWACO consistently converges to favorable outcomes. In comparison, the SMA and the GTO algorithm exhibit robust competitiveness, especially excelling in f1 and f8. Despite MHWACO’s less-than-ideal results in these two functions, its relative error is not significantly pronounced compared to the SMA and the GTO algorithm, remaining within a similar exponential level. Additionally, as the dimensionality increases, MHWACO’s solution accuracy does not exhibit a noticeable decline. In contrast, the SMA, in the f11 function, experiences a deviation of 4000 in the optimal value when transitioning from 10 to 20 dimensions. Meanwhile, with increasing dimensions, MHWACO’s convergence efficiency continues to improve, converging to feasible solutions in the early stages of iteration, thanks to the superior GSS strategy.
Merely relying on average values and standard deviations to illustrate differences in convergence effects among algorithms is insufficient. This study employs Friedman’s test to further confirm that MHWACO outperforms other algorithms, as shown in
Table 11. According to rankings, MHWACO, SMA, and GTO are the top three algorithms. A noteworthy observation emerges from the two charts, SMA holds the top position in the overall rankings at a dimension of 10, while MHWACO claims the first position as the dimension increases to 20. This indicates that the proposed algorithm is less affected by dimensionality, consistently yielding favorable results as dimensions increase. In summary, MHWACO’s convergence performance reveals competitive or superior optimal solutions compared to the contrasted algorithms.
4.5. Comparison of MHWACO on High Dimensions
To further demonstrate the prowess of MHWACO in tackling high-dimensional optimization challenges, we juxtaposed the performance of MHWACO against variants of other evolutionary algorithms across dimensions of 200, 500, and 1000 in this section. The comparative results are presented in
Table 12 –
and
Table 13–
.
From the table, it is discernible that MHWACO can obtain solutions in 25 functions comparable to or better than those obtained by the comparative algorithms. Additionally, MHWACO can maintain convergence results in the majority of functions across dimensions of 200, 500, and 1000, demonstrating the algorithm’s commendable scalability. However, several of its variants exhibit a noticeable decline as dimensions increase, notably LFDE, with a maximum precision error exceeding 20 orders of magnitude in multiple functions. In contrast, the COSMA and ASFSSA demonstrate robust competitiveness, with their mean and standard deviation outperforming MHWACO in the f5 function. Nevertheless, in the performance on other functions, their experimental results exhibit significant disparities from MHWACO. This discrepancy arises because the FA strategy in MHWACO aids the algorithm in finding feasible solutions and escaping local traps, while relying on the GSS strategy to ensure the algorithm generates different search tasks in various search stages. Extensive experimentation reveals that, while the algorithm can ascertain superior solutions in the later stages of iteration, due to the substantial scale of high-dimensional problems and the closely related complexity of GSS to the dimensionality of the solution space, a balance between computational complexity and algorithm precision is struck. In solving high-dimensional problems, the algorithm is deemed stagnant if the optimal solution remains unchanged for over 50 iterations in a row, considering it has found a feasible solution to the problem, thus halting the algorithm.
5. The Application of MHWACO in Prediction Problems
The realm of intelligent optimization algorithms, serving as a genre of search and refinement techniques, furnishes efficacious tools and methodologies for the training and optimization of machine learning algorithms. When confronted with intricate machine learning predicaments, conventional optimization methods may find themselves constrained by local optima or the challenges posed by high-dimensional data. Intelligent optimization algorithms, by emulating the behavioral patterns of biological populations in the natural world and leveraging other metaheuristic strategies, adeptly tackle these challenges. Through the application of intelligent optimization algorithms, machine learning models gain an enhanced adaptability to diverse data features and structures, thereby elevating predictive accuracy and robustness. These algorithms, by fine-tuning hyperparameters and optimizing model architectures, augment the model’s generalization capacity, mitigate overfitting risks, and enhance interpretability. The flexibility and scalability of intelligent optimization algorithms make them pivotal tools for addressing complex data analysis and prediction tasks.
To ascertain the practical applicability of MHWACO, this paper employs it for optimizing parameters in machine learning algorithms. In recent years, intelligent optimization algorithms have found widespread application in various facets of machine learning, encompassing anomaly detection, feature selection, and parameter optimization. In
Section 5.1, the algorithm is applied to the parameter optimization of Support Vector Machine (SVM); in
Section 5.2, it is utilized to optimize random forest (RF).
The datasets chosen in this paper are presented in
Table 14. These five classical datasets have found widespread applications in the realms of machine learning and data science. Firstly, the Boston Housing dataset encompasses information about housing prices in the Boston area, including various features such as crime rates and education levels, laying the foundation for the study of housing price prediction. The Stock dataset comprises authentic stock data extracted from the Shanghai Composite Index of the Shanghai Stock Exchange, spanning from 1 January 2021 to 1 January 2023. The Diabetes dataset is employed to explore the relationship between physiological characteristics of diabetic patients and the progression of the disease, featuring attributes like age, gender, and body mass index. The Forest Fire dataset aims to predict the scale of forest fires, incorporating detailed information such as coordinates, occurrence time, and meteorological conditions. Lastly, the Air Quality dataset provides information about urban air quality, including concentrations of carbon monoxide, nitrogen dioxide, and temperature, contributing to environmental monitoring and policy formulation. These datasets not only offer profound insights for in-depth studies in relevant domains but also establish standardized benchmarks for the training and evaluation of machine learning algorithms. These datasets originate from the regression datasets in Scikit-learn in PyCharm and are provided by the University of California, Irvine. In this section, we will fully leverage these datasets to further affirm the algorithm’s utility in solving problems related to stock prediction, housing prediction, disease prediction, fire prediction, and air quality detection, concurrently evaluating its performance in tackling real-world challenges. The study utilizes MHWACO to optimize the PCA-reduced data and compares its performance with the improved algorithm MSWOA [
33], MWOA [
34], LFDE [
35], EWOA [
28], COSMA [
11], EAPSO [
12], MELGWO [
7], and ASFSSA [
9].
To evaluate the effectiveness of the proposed algorithm, this study utilizes five metrics to assess the quality of the resulting solution.
1. Mean Bias Error (MBE):
The MBE measures the average bias between predicted values and actual values. The MBE quantifies the average bias between the predicted values
and the actual values
. The formula is expressed as the mean of the differences over the number of samples
n:
2. Coefficient of Determination ():
The
assesses the model’s explanatory power by comparing the sum of squared differences between predicted and actual values to the total variance. The formula involves the mean of actual values
, and the number of samples
n:
3. Mean Absolute Error (MAE):
The MAE assesses the average absolute difference between the predicted and the actual values:
4. Explained Variance Score (EVS):
The EVS evaluates the model’s capability to explain variance, utilizing the actual values and the predicted values:
5. Median Absolute Error (MEAE):
The MEAE measures the median absolute difference between predicted and actual values.
5.1. MHWACO Applied to SVM
The SVM, a machine learning model employed in supervised learning, operates on the fundamental principle of defining a decision boundary using training samples. Simultaneously maximizing the margin between the boundary and the training samples, an SVM aims to discover the optimal hyperplane capable of effectively separating data of different classes, ensuring accurate classification of new samples. Demonstrating excellence in handling high-dimensional and non-linear data, an SVM exhibits robust generalization capabilities. Extending an SVM to regression tasks yields Support Vector Regression (SVR) [
47]. Diverging from traditional regression methods, SVR represents a non-linear, non-parametric regression technique. Its core idea is to identify a function minimizing the difference between the function values and actual observed values within a specified tolerance range. Commonly employed kernel functions include linear, polynomial, and Gaussian radial basis kernels. These kernels facilitate mapping data to higher-dimensional spaces, transforming originally linearly inseparable data into linearly separable forms, thereby enhancing predictive efficacy.
This paper employs the MHWACO algorithm for optimizing the parameters’ gamma and the penalty factor C in the kernel function of SVR. It is essential for the value of gamma to be greater than 0. As gamma increases, the classification performance on the test set deteriorates while improving on the training set. Additionally, it elevates the model’s complexity, resulting in poor generalization ability and potential overfitting. Regarding C, any positive value can be selected based on requirements. A higher C indicates a greater emphasis on the overall error during the optimization process, demanding a more substantial reduction in errors, even at the expense of reducing the margin. In the process of algorithmic fine-tuning, the evaluation metric Mean Squared Error
is employed to further refine parameters; it measures the average squared difference between predicted and actual values, enhancing the algorithm’s efficacy. Throughout the manuscript, a consistent population size of 30 is maintained, as well as an iteration count of 50; 10 repetitions of the experiments are conducted, and a training-to-testing dataset split ratio of 0.8 is adhered to. The comparative results are presented in
Table 15.
Among these evaluation metrics, namely MAE, MBE, and MEAE, a proximity to zero is indicative of superior performance, while for the metrics and EVS, a closeness to one signifies excellence. Upon scrutinizing the charts, it becomes evident that MHWACO exhibits unparalleled performance in the Stock and Diabetes datasets, securing the top position with all five metrics ranking first. In the Boston dataset, the performance of MBE is slightly lower than LFDE, where LFDE’s negative MBE value suggests the model’s tendency to underestimate actual observations. Conversely, MHWACO’s positive MBE value indicates an overall overestimation of actual values. Among all the comparative algorithms, the most competitive algorithm is EAPSO, demonstrating superiority in all three metrics over MHWACO in the Air Quality dataset. However, the absolute difference is not exceeding 0.03, providing substantial validation for the effectiveness of MHWACO in addressing SVR problems.
5.2. MHWACO Applied to RF
RF stands as an ensemble learning method crafted to address classification, regression, and diverse tasks. Rooted in decision tree construction, a random forest comprises multiple decision trees. Each tree independently samples input data, and their collective predictions amalgamate into the ultimate forecast. Throughout the construction of each tree, RF randomly selects feature subsets, contributing to variance reduction in the model. This strategy effectively mitigates the risk of overfitting compared to a single decision tree. Moreover, it adeptly manages voluminous input features while maintaining a degree of tolerance to noise in the training dataset. Renowned for its prowess, random forest excels across various domains, encompassing medical diagnostic [
48], and beyond.
This manuscript will employ MHWACO for optimizing the hyperparameters of RF to enhance the model’s performance. Firstly, the maximum depth of each tree, referred to as ’max depth’, serves to control the depth of the trees, preventing overfitting, yet care should be taken to avoid excessively small values to mitigate underfitting. Secondly, the quantity of trees in the random forest, a parameter governing the number of trees in the ensemble. Increasing the number of trees typically enhances performance, though at the expense of higher computational costs. The parameter configurations in the text align with those outlined in
Section 5.1. The comparative results are presented in
Table 16.
As depicted in the illustration, MHWACO surpasses the comparative algorithms significantly when handling the Forest Fires dataset. In this algorithm, the indicators of other algorithms are either negative or small values, indicating that the comparative algorithms are experiencing model overfitting when dealing with this dataset. This, in turn, leads to poorer predictive performance on the test set. In the realm of RF, the most competitive algorithms are the MWOA and LFDE, showcasing commendable results in the Boston and Diabetes datasets. However, their predictive performance in other datasets is comparatively lackluster, suggesting significant variations in the effectiveness of the comparative algorithm when dealing with different datasets. By employing multiple metrics to contrast MHWACO with the improved algorithms across various datasets, this validates the robustness of the proposed algorithm in optimizing RF parameters.
6. Conclusions and Future Work
This paper introduces a framework to address large-scale optimization problems, focusing on optimizing regression model parameters for predictive analysis. By employing Gaussian initialization, individual initial positions achieve a more extensive coverage of the solution space. At the onset of iterations, the integration of transition probabilities into the algorithm’s global and local searches guides the purposeful movements of individuals. Subsequently, individuals trapped in local optima are endowed with the capability to escape, further enhancing the algorithm’s global exploration. Considering population activity comprehensively, the need for the GSS is assessed to balance the algorithm’s global search and precise optimization capabilities. Towards the conclusion of iterations, the update of pheromones directs subsequent individuals towards effective directions. This paper compares MHWACO with 9 novel algorithms and 9 enhanced algorithms across 34 benchmark functions and the CEC2022 test suite, effectively demonstrating the advanced nature and stability of the proposed algorithm. Additionally, the comparative analysis of computation time is not shown in the experimental part because the running time of the proposed algorithm is equivalent to or slightly worse than that of other algorithms under the same population size and evolutionary iterations, and the calculation time is acceptable.
We believe in the immense potential and expansive prospects of EAs in addressing machine learning challenges. Future research endeavors will continue to explore enhancements to ACO and its applications in other domains. Specific improvement strategies include, firstly, employing effective tactics to reduce the algorithm’s complexity, thereby enhancing operational efficiency and interpretability. Secondly, adjusting strategies like the GSS can be used in the algorithm to achieve a higher alignment with issues such as feature selection, transfer learning, and data augmentation, addressing complex real-world problems. As intelligent optimization algorithms continue to evolve and innovate, the field of machine learning will encounter more opportunities and challenges. The introduction and application of improved EAs will propel significant achievements in the application of machine learning across various domains.