5.1 Resource Allocation Metrics
The first Research Question (RQ1) concerns the metrics most used for resource allocation in a fog computing environment. A metric is almost always linked to a criterion, giving rise to the allocation objective, which in turn is expressed in terms of QoS [144]. For example, the metric “cost” is linked to the criterion of “reduction”, generating the objective of “minimizing the cost”.
Analyzing the selected publications, the following resource allocation metrics were observed: resource utilization, cost, latency, energy, user experience, and execution time. They are detailed in Table 2, which also gives the percentage of the analyzed papers that each metric represents.
From the analysis of the publications presented in Table 2, and answering RQ1, it was possible to identify that the most addressed metric in resource allocation is cost, covered in 31 of the 108 publications (28.7%). In fact, reducing cost is an objective that can intersect with several other metrics, such as reducing energy consumption, reducing latency, or even reducing execution time. Accordingly, some papers appear in two different classifications [45, 48, 66, 78, 80, 90, 98, 102, 107, 109, 138, 155, 161, 168, 174, 175, 187, 188], which happens when their authors combine different metrics in their proposals. Etemadi et al. [56], for example, proposed a resource allocation model that combined three metrics (cost, latency, and resource utilization), aiming to reduce the total cost and delay violations and to increase fog node utilization. When grouped, these metrics become even more relevant to the resource allocation field in fog computing, as they optimize resource utilization and thus avoid wasting computational power, since fog nodes usually offer limited resources [135].
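The multi-metric combination described above can be illustrated with a weighted scoring function. The following sketch is purely illustrative (the node names, weights, and figures are assumptions, not values from any surveyed paper), and it presumes the metrics have been pre-scaled to comparable ranges:

```python
# Hypothetical multi-metric allocation score: lower is better.
# Weights and all numbers below are illustrative assumptions.
def allocation_score(cost, latency_ms, utilization,
                     w_cost=0.5, w_lat=0.3, w_util=0.2):
    """Combine metrics into a single objective.

    utilization is in [0, 1]; higher utilization is rewarded,
    so it enters the score with a negative sign. Metrics are
    assumed to be pre-scaled to comparable magnitudes.
    """
    return w_cost * cost + w_lat * latency_ms - w_util * utilization

# Score candidate fog nodes (cost, latency ms, current utilization).
candidates = {
    "fog-a": allocation_score(cost=2.0, latency_ms=10.0, utilization=0.6),
    "fog-b": allocation_score(cost=1.5, latency_ms=25.0, utilization=0.4),
    "fog-c": allocation_score(cost=3.0, latency_ms=5.0, utilization=0.9),
}
best = min(candidates, key=candidates.get)   # node with the lowest score
```

Here the low-latency, highly utilized node wins despite its higher cost, showing how the weighting trades the metrics off against each other.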
Reducing latency is also a very significant metric in fog computing when compared to other computing paradigms, such as cloud computing [33], and this was confirmed by the results: it is one of the most used metrics in the analyzed articles (24.1% of them). The demand for fog computing is usually linked to the need to reduce application response time, and latency has a direct impact on this. Thus, reducing latency can be considered inherent to the resource allocation process in fog computing and should be pursued by all resource allocation proposals, since low latency is an essential feature of this paradigm [67].
In addition to the metrics listed in this section, some other metrics could be proposed. The allocation time, for instance, is a relevant candidate: it refers to the time needed to receive the workload information, estimate and verify the necessary resources, and make the allocation effective. Considering the mobility and dynamics of fog applications, an optimized allocation time plays a fundamental role in ensuring a good user experience and in achieving the required service quality [75].
Other metrics that could be used relate to the time and number of migrations between fog and cloud nodes, as an efficient allocation should keep the number of migrations as low as possible. The use of predictive algorithms is also suggested, as they can assess how stable the allocated resources are, helping to ensure that the resources remain available until the end of workload execution.
5.2 Resource Allocation Techniques
RQ2 asks which techniques are most used in the analyzed papers for resource allocation in fog computing. Table 3 shows the techniques used in the analyzed papers, grouped into Integer Linear Programming (ILP) / Nonlinear Programming (NLP), Heuristics, Meta-heuristics, Fit-based approaches, Multiple Criteria Decision Making (MCDM), Game-based approaches, and Machine Learning.
Linear Programming consists of methods to solve optimization problems subject to constraints, where the objective function is linear in the control variables and the domain of these variables is defined by a system of linear inequalities [205]. The main advantage of Linear Programming is the flexibility to analyze complex problems [159]. A Linear Programming approach was used in [15, 40, 59, 61, 104, 111, 133, 158, 160, 167, 170, 172, 179, 183, 192, 195, 201], which represent 15.7% of the total analyzed papers. It is important to highlight that most of these proposals aim to meet only one metric (e.g., reducing latency in [158]), since this follows from the single linear objective function. In three other papers (2.7% of the analyzed papers) [17, 110, 161], Mixed Integer Linear Programming (MILP), an extension of ILP in which some decision variables are not required to be discrete, was used to solve the resource allocation problem.
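To make the ILP-style formulation concrete, the following toy sketch enumerates all 0/1 task-to-node assignments and minimizes total latency under per-node capacity constraints. It is an illustration only: the task names, node names, and latency figures are assumptions, and a real proposal would use an ILP solver rather than exhaustive search:

```python
from itertools import product

# Toy ILP-style formulation solved by exhaustive search; all names
# and numbers below are illustrative assumptions.
tasks = ["t1", "t2", "t3"]           # workloads to place
nodes = ["fog1", "fog2"]             # candidate fog nodes
capacity = {"fog1": 2, "fog2": 2}    # max tasks per node (constraint)
latency = {                          # latency of each task on each node (ms)
    ("t1", "fog1"): 5, ("t1", "fog2"): 9,
    ("t2", "fog1"): 7, ("t2", "fog2"): 4,
    ("t3", "fog1"): 6, ("t3", "fog2"): 8,
}

best_assignment, best_total = None, float("inf")
for assign in product(nodes, repeat=len(tasks)):   # every possible assignment
    load = {n: assign.count(n) for n in nodes}
    if any(load[n] > capacity[n] for n in nodes):  # enforce capacity
        continue
    total = sum(latency[(t, n)] for t, n in zip(tasks, assign))
    if total < best_total:                          # keep the minimum
        best_assignment, best_total = dict(zip(tasks, assign)), total
```

The linear objective (a sum of per-task latencies) mirrors the single-metric character of most LP-based proposals noted above.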
Nonlinear Programming, on the other hand, is the process of solving an optimization problem defined by a system of equalities and inequalities, called constraints, over a set of real variables whose values are unknown, with an objective function to be maximized or minimized, and where some of the constraints or the objective function are nonlinear [30]. This type of approach was used to address the resource allocation problem in fog computing in seven publications [15, 58, 70, 104, 111, 183, 201]. Among these works, Fan et al. [58] used, in addition to Nonlinear Programming, a Markov Decision Process technique to optimize the results of the resource allocation process in their proposal.
Heuristics were the most used techniques in the analyzed papers. In this type of solution, decisions are based only on the information currently available, without considering the future effects of such decisions, thus making the locally optimal choice at each stage of execution. The goal is to find a good, though not necessarily optimal, global solution. This type of approach suits the fog computing model as it copes well with the dynamics of the environment, which stem from the essential characteristics of high geographic distribution, heterogeneity, and interoperability. Heuristics are therefore considered easy to implement and efficient [34]. In the analyzed papers on resource allocation in fog computing, some authors (12%) proposed new heuristic algorithms, as in [27, 63, 73, 79, 88, 102, 115, 122, 156, 177, 187, 193, 202]. Well-known heuristic methods were also used, such as price-based approaches (10.1% of the total) [1, 2, 3, 4, 5, 86, 92, 117, 140, 142, 203], greedy algorithms (2.7%) [32, 82, 114], the Lyapunov optimization approach (2.7%) [8, 38, 109], and the Hungarian algorithm (0.92%) [185].
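The locally optimal, future-blind behavior of a greedy heuristic can be sketched as follows. The node names, capacities, and latencies are illustrative assumptions, not taken from any surveyed paper:

```python
# Sketch of a greedy heuristic: each request takes the locally best
# node (lowest latency with enough free capacity), ignoring future
# requests. All names and figures are illustrative assumptions.
def greedy_allocate(requests, nodes):
    """requests: list of (name, demand);
    nodes: dict name -> [free_capacity, latency_ms] (mutated in place)."""
    placement = {}
    for name, demand in requests:
        feasible = [n for n, (free, _) in nodes.items() if free >= demand]
        if not feasible:
            placement[name] = None                        # rejected (or sent to cloud)
            continue
        best = min(feasible, key=lambda n: nodes[n][1])   # locally optimal choice
        nodes[best][0] -= demand                          # consume capacity
        placement[name] = best
    return placement

nodes = {"fog1": [4, 5], "fog2": [2, 3]}   # [free capacity, latency ms]
placement = greedy_allocate([("r1", 2), ("r2", 2), ("r3", 3)], nodes)
```

Note how the early, locally optimal choices exhaust capacity and leave the last request unplaced, which a globally optimal method could have avoided. This is exactly the trade-off between simplicity and optimality described above.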
Similarly, meta-heuristic techniques combine basic heuristics at a higher structural level, aiming to find an optimal or near-optimal solution in a limited execution time, which is very relevant in fog computing environments given their dynamism and mobility. Within this category are Evolutionary Algorithms, which are based on the principles of natural evolution, maintaining a population of candidate solutions throughout the search [53]. After initialization, new solutions are generated iteratively by selecting good solutions from the population, crossing them over, and mutating them. The new individuals are evaluated and inserted into the population, usually replacing the worst solutions. The algorithm is normally stopped after a certain number of iterations, returning the best solution found in that period [53]. Evolutionary algorithms for resource allocation in fog computing environments were used in seven papers, with the following algorithms: Elitist Selection Strategy [138, 148], Pigeon Inspired Optimization [16], Weighted Sum Genetic Algorithm [72], Hungarian Algorithm [10], Directed Acyclic Graph [175], and Estimation of Distribution Algorithm [186]. Besides these, other meta-heuristic techniques were found in the reviewed papers, such as the Particle Swarm Optimization algorithm used in [66] and Ant Colony Optimization adopted in [204].
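The selection–crossover–mutation loop described above can be sketched as a minimal evolutionary algorithm for task-to-node assignment. Population size, rates, latencies, and node names are all illustrative assumptions, and the fitness function is deliberately trivial:

```python
import random

# Minimal evolutionary-algorithm sketch for task-to-node assignment.
# Sizes, rates, and latency figures are illustrative assumptions.
random.seed(1)
N_TASKS, NODES = 6, ["fog1", "fog2", "fog3"]
LAT = {"fog1": 5, "fog2": 3, "fog3": 8}        # per-task latency per node (ms)

def fitness(ind):                               # lower total latency is better
    return sum(LAT[n] for n in ind)

# Initial random population of candidate assignments.
pop = [[random.choice(NODES) for _ in range(N_TASKS)] for _ in range(20)]
for _ in range(30):                             # generations
    pop.sort(key=fitness)
    survivors = pop[:10]                        # elitist selection
    children = []
    while len(children) < 10:
        a, b = random.sample(survivors, 2)      # pick two parents
        cut = random.randrange(1, N_TASKS)      # one-point crossover
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:               # mutation
            child[random.randrange(N_TASKS)] = random.choice(NODES)
        children.append(child)
    pop = survivors + children                  # replace the worst half
best = min(pop, key=fitness)
```

Because elitism never discards the best individual, the best fitness is non-increasing across generations, which matches the "replacing the worst solutions" behavior described above.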
Considering the fit-based approaches, the First-Fit algorithm was used in three papers [141, 152, 169]. In this technique, the allocation problem is solved by providing the first resource that delivers the requested parameters, regardless of whether better options exist. Similarly, the Shortest Job First algorithm used in [90] prioritizes the allocation of the smallest requests. Only the Best-Fit algorithm presented in [190] looks for the best allocation considering the inputs and the available resources. Although these proposals are valid for some fog computing scenarios, they may be ineffective in environments with high demand, that is, a large number of requests from IoT devices, for example.
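The difference between First-Fit and Best-Fit can be shown side by side on the same request. Node names and capacities are illustrative assumptions:

```python
# First-Fit vs Best-Fit on the same request; capacities are
# illustrative assumptions. Relies on dict insertion order (Python 3.7+).
def first_fit(demand, free):
    for node, cap in free.items():       # first node that fits, in scan order
        if cap >= demand:
            return node
    return None

def best_fit(demand, free):
    fitting = {n: c for n, c in free.items() if c >= demand}
    # tightest fit: the node that leaves the least slack after allocation
    return min(fitting, key=fitting.get) if fitting else None

free = {"fog1": 8, "fog2": 3, "fog3": 5}   # free capacity per node
ff = first_fit(4, free)   # first node that fits, even if wasteful
bf = best_fit(4, free)    # node leaving the least unused capacity
```

First-Fit grabs the large node even though a tighter match exists; under high demand this waste compounds, which is the ineffectiveness noted above.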
With regard to MCDM techniques, in [129] the authors used the PROMETHEE method (Preference Ranking Organization Method for Enrichment Evaluation), while in [107] and [112] the authors used the ELECTRE method (ELimination Et Choix Traduisant la REalité, i.e., elimination and choice expressing reality). The AHP method was used in [51, 54, 130, 182]. Although these methods were able to meet some established QoS criteria, they could not guarantee that the minimum requirements were met. The same occurs with the TOPSIS method, used in [23, 25, 26, 83], which is also limited in its ability to achieve refined delivery quality.
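As an illustration of how an MCDM method ranks fog nodes, the following is a minimal TOPSIS sketch. The criteria, weights, and matrix values are illustrative assumptions; the steps (vector normalization, weighting, ideal/anti-ideal points, closeness coefficient) follow the standard TOPSIS procedure:

```python
import math

# Minimal TOPSIS sketch for ranking fog nodes; criteria, weights,
# and values are illustrative assumptions. "Cost"-type criteria
# are better when lower; "benefit"-type when higher.
alts = ["fog1", "fog2", "fog3"]
# columns: latency (cost), price (cost), free capacity (benefit)
matrix = [[10.0, 2.0, 6.0],
          [20.0, 1.0, 8.0],
          [15.0, 3.0, 4.0]]
weights = [0.5, 0.3, 0.2]
benefit = [False, False, True]

# 1) Vector-normalize each column, then apply the weights.
cols = list(zip(*matrix))
norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
weighted = [[w * v / n for v, w, n in zip(row, weights, norms)]
            for row in matrix]

# 2) Ideal (best) and anti-ideal (worst) points per criterion.
wcols = list(zip(*weighted))
ideal = [max(c) if b else min(c) for c, b in zip(wcols, benefit)]
worst = [min(c) if b else max(c) for c, b in zip(wcols, benefit)]

def dist(row, ref):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ref)))

# 3) Closeness coefficient: 1 = ideal, 0 = anti-ideal.
closeness = {a: dist(r, worst) / (dist(r, ideal) + dist(r, worst))
             for a, r in zip(alts, weighted)}
ranking = sorted(alts, key=closeness.get, reverse=True)
```

Note that TOPSIS only orders the alternatives by relative closeness; nothing in the procedure checks absolute thresholds, which is consistent with the inability to guarantee minimum requirements noted above.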
The game-based approach uses mathematical models to make optimal decisions under conflicting conditions. A basic element is the set of players who participate in the game, each with their own strategies; the choice of a strategy determines a situation among all possible situations, and each player has an interest in, or preferences over, each situation in the game [6]. Among the analyzed publications, the Stackelberg game [143], in which the leader moves first and the other players move in sequence, was the most used (2.7% of the total papers) [80, 93, 199]. Unlike proposals based on MCDM, these techniques are better able to handle the variations and restrictions of the fog computing environment, and therefore achieve interesting results in the resource allocation process for this paradigm.
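The leader-first structure of a Stackelberg game can be sketched with a toy pricing model: a fog provider (leader) commits to a price, users (followers) best-respond with their demand, and the leader picks its price by anticipating that response. The demand function and numbers are illustrative assumptions:

```python
# Toy Stackelberg pricing game; the demand model and all numbers
# are illustrative assumptions, not from any surveyed paper.
def follower_demand(price):
    """Followers' best response: demand falls linearly with price."""
    return max(0.0, 10.0 - 2.0 * price)

# The leader moves first: it anticipates the followers' response and
# picks the revenue-maximizing price over a discrete grid.
prices = [p / 10 for p in range(0, 51)]              # 0.0 .. 5.0
best_price = max(prices, key=lambda p: p * follower_demand(p))
best_revenue = best_price * follower_demand(best_price)
```

The leader's advantage comes precisely from committing first while internalizing the followers' reaction, which is the sequential structure described above.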
Finally, Machine Learning techniques involve algorithms that aim to learn the fog environment, the users, and the resource allocation behaviors in order to predict new requests. In this context, there are Deep Learning techniques [188, 198], Deep Reinforcement Learning [68, 69, 98, 106], Bayesian learning [56, 168], and Fuzzy Logic [64, 155], which goes beyond the limits of Machine Learning and enters the Artificial Intelligence field [196]. Analysis of the results obtained in this survey shows an increase in the use of these techniques for resource allocation in fog computing in recent years, since 70% of such proposals were published in the last three years. Although they may require greater computational power for their execution, given the need to process historical series and large volumes of data, they are more accurate in choosing the best resource to be allocated, even when considering different input parameters.
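The learning-based idea can be sketched with a minimal, stateless reinforcement-learning loop (a bandit-style Q-learning allocator that learns which node gives the lowest latency from observed rewards). The latency distributions and hyper-parameters are illustrative assumptions, far simpler than the Deep Reinforcement Learning used in the surveyed papers:

```python
import random

# Minimal reinforcement-learning sketch (stateless Q-learning):
# the allocator learns which node yields the lowest latency.
# Latency figures and hyper-parameters are illustrative assumptions.
random.seed(0)
MEAN_LATENCY = {"fog1": 30.0, "fog2": 10.0, "cloud": 80.0}

q = {n: 0.0 for n in MEAN_LATENCY}      # estimated value of each node
alpha, epsilon = 0.1, 0.2               # learning rate, exploration rate
for _ in range(2000):
    if random.random() < epsilon:       # explore a random node
        node = random.choice(list(q))
    else:                               # exploit the current best estimate
        node = max(q, key=q.get)
    latency = random.gauss(MEAN_LATENCY[node], 2.0)   # noisy observation
    reward = -latency                   # lower latency = higher reward
    q[node] += alpha * (reward - q[node])             # Q-value update

best_node = max(q, key=q.get)
```

After enough interactions the value estimates approach the (negated) mean latencies, so the allocator settles on the low-latency fog node; this trial-and-error cost is the extra computational effort the paragraph above refers to.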
5.3 Covered Architecture Layers
Analyzing which layers of the architecture are considered in resource allocation approaches in fog computing is necessary to answer research question RQ3. Most of the analyzed papers considered the fog architecture divided into three layers (IoT, Fog, and Cloud) in their proposals, as presented in Section 2. Therefore, resource allocation in fog computing can be considered a double correspondence problem [65]. Approaches to solving this problem may involve only the Fog Layer, the communication between two layers (IoT x Fog or Fog x Cloud), or even three layers (IoT x Fog x Cloud) of the fog architecture. The publications are classified according to these scenarios in Table 4.
There are approaches that consider only the link formed between end-user devices, located in the IoT Layer, and devices in the Fog Layer. Once this link is established, they will connect to the Cloud Layer as a single resource group. Although these approaches do not disregard the existence and the relationship of the Fog Layer with the Cloud Layer, they seek alternatives to address all resource allocation problems in the Fog Layer, avoiding forwarding requests to the Cloud Layer.
Some analyzed publications focused on the connection formed between the Fog Layer devices and the Cloud Layer ones. The resources made available through this relationship allowed end users to run their workloads on them. The papers in this context consider that the services required by end-users, who are in the IoT Layer, have already been distributed in the Fog Layer and based on this premise, need to be provisioned using the resources available in that layer and in the Cloud Layer.
There are also some papers that apply their proposals considering only the Fog Layer. In this scenario, all requests must be solved by the resources available in the fog environment. However, this is not usual, since fog computing is complementary to the cloud and not a paradigm to replace it.
Finally, most works analyzed in this article addressed their proposals using all three layers of the fog computing architecture. This is the most common approach because it considers that the Fog Layer is only an intermediate layer for reaching the defined objectives (for example, improving QoS). The requests are generated in the IoT Layer and are fully met using not only the Fog Layer but also the Cloud Layer. This makes sense since, as indicated in the definitions of fog computing presented in Section 2, this paradigm is intended to complement, and not substitute, cloud computing. In total, 53 of the 108 analyzed publications (49%) considered all three fog layers in developing their proposals.
5.4 Virtualization Models
RQ4 aims to identify the virtualization models used in resource allocation approaches in fog computing. It is important to emphasize that in fog computing the availability of resources (such as processing, memory, storage, and network) is essential. Unlike cloud computing, where resources are always available, in the fog there is a strong constraint on resources, as fog nodes are often devices with low computational capacity. A switch, for example, has the main function of managing network connections, but it is used by the fog layer to provide its idle computing resources for processing, storage, and so on. Given the above, the virtualization models used in the analyzed studies can be grouped into two categories, as presented in Table 5 and discussed below.
The Virtual Machine (VM) concept is widely used, as it exploits virtualization at the hardware level so that multiple operating systems can run independently on a single physical resource. VM instances are executed on an abstraction layer called the hypervisor, which allows the hardware to be shared between different instances [120]. The container is a type of virtualization that is lighter than virtual machines and offers virtualization at the operating-system level [120]. Containers isolate processes with just the necessary application packages and are highly portable across multiple fog computing nodes. In [193] the authors present some advantages of containers over virtual machines: containers start faster than VMs because no hypervisor is required, containers are better than VMs in terms of performance, and the greater the number of VMs deployed on a server, the higher the performance degradation of that server.
The use of an adequate virtualization model is fundamental to application performance and to achieving the objectives indicated in the QoS [19]. Undoubtedly, VMs can be the best option in some use cases, such as those requiring greater isolation of the application or service. However, according to the analyzed articles, containers are better suited to the resource allocation field in fog computing, since they are lighter and more dynamic than virtual machines, favoring mobility and adapting better to the resource constraints that characterize this computational model [125].
In two of the analyzed papers [115, 193], both virtualization models, virtual machine and container, were used to better apply the proposed resource allocation method. Finally, none of the analyzed papers uses unikernels, a virtualization model that is even more lightweight than containers [118]; this is a gap to be explored in new proposals to achieve efficient resource allocation.
5.5 Fog Computing Proposals Evaluation
Simulation tools and models are used to evaluate proposals by bringing the evaluated system closer to the real environment. A model is a representation of an actual or planned system [173]. Simulators are used to study the behavior of a system and understand the factors that affect its performance as it evolves over time [124]. Simulation frameworks provide solutions in cases where mathematical modeling techniques are difficult or impossible to apply due to the scale, complexity, and heterogeneity of a fog computing system [173]. Simulation is a way to imitate the operation of real systems, with the freedom to modify the inputs, model a series of characteristics, analyze existing systems, or support the design of new ones. It also helps to identify and balance costs [24].
Some simulators originally used to validate studies in cloud computing have been adapted for fog computing. In addition, new simulators have been specifically designed to meet the demands of fog computing. A detailed analysis of several simulators for fog computing was presented in [124]. This section aims to answer RQ5, which addresses how the proposed approaches were evaluated. An analysis of the simulators used in the selected publications is presented in Table 6.
The majority of the analyzed papers used numerical simulators to validate their proposals. This type of simulation is used to study the behavior of systems whose mathematical models are too complex to provide analytical solutions, as in many nonlinear systems, which is the situation found in many proposals addressing resource allocation in fog computing. The most common simulator was Matlab [163], used by 45 of the 108 publications, that is, about 42%.
CloudSim [36] was proposed to simulate cloud computing services. It is a library for cloud computing simulation developed in the Java language, where each entity is represented as a class. Accordingly, most of the works that used CloudSim presented a solution integrated with cloud computing environments, justifying the use of this simulator. An extension of CloudSim is iFogSim [74], which allows one to model IoT and fog environments and measure the impact of the proposed resource management techniques in terms of latency, network congestion, energy consumption, and cost. Considering that it was only presented in 2017 [74], and allowing for the time required for its maturation and wider adoption, this simulator has come to be used more recently by academics.
Less representatively, some analyzed studies used other simulators to evaluate their proposals. The GridSim simulator [35], which allows the modeling and simulation of application models for grid computing, was used in [131]. Finally, about 27% of the publications were evaluated in test environments built specifically to validate the paper's proposal. In this type of test, all the software and hardware are configured in a stand-alone way, using synthetic data sets.
The predominance of simulators and mathematical models can be seen as a weakness in the evaluation of proposals for the resource allocation process, since these models can hide unexpected behaviors of fog computing, especially considering the heterogeneity and mobility characteristics of these environments.
5.6 Fog Computing Domains
Of the 108 analyzed papers, just 20 (18%) indicated a specific domain to which their proposals apply. Thus, to address RQ6, these domains are detailed in Table 7.
In recent years, growing attention has been paid to systems that support the development of vehicular networks. This is because vehicles are increasingly equipped with powerful on-board computers, large-capacity data storage units, and more advanced communication modules to improve safety, convenience, and driving satisfaction [103]. These vehicles must be able to compute, store, and communicate with other vehicles or devices. The features and benefits of fog computing are well suited to this type of service, as vehicles have high mobility and some services, such as autonomous driving, require a very low response time to be effective and safe. For this reason, resource allocation proposals for this domain must prioritize execution time.
A trend in the health area is the use of Medical Cyber-Physical Systems (MCPS), which allow a continuous and intelligent interaction between computational elements and medical devices (e.g., heart-rate monitors) [70]. However, considering the complexity and high quality of the required services, these devices need low latency and other guarantees when communicating with the cloud computing platform. Fog computing is therefore a promising approach for these systems, and the resource allocation proposals in this domain have focused on the latency reduction metric.
As fog computing is closer to IoT devices, it is widely used in smart city, smart building, and industry projects [194]. A smart building is one that is responsive to the requirements of occupants, organisations, and society. It also needs to be sustainable (energy and water consumption), healthy (well-being of the people living and working within it), and functional (user needs) [42]. A smart building use case was employed in [26], [66], [161], and [90] to frame the resource allocation proposal. Like smart buildings, smart manufacturing is also a use case that can take advantage of the benefits of fog computing, relying on the high geographic distribution and heterogeneity of fog devices to seek an optimized resource allocation for the execution of applications and services.
Finally, Virtual Reality was the domain used in two analyzed papers to illustrate resource allocation. As the number of applications requiring low latency increases, new use cases for fog computing are expected to appear in the coming years. The papers analyzed in this survey that used this domain focused their resource allocation proposals on the latency reduction and resource utilization metrics.