1 Introduction

A distributed system is a set of independent elements that work together to accomplish a common goal. As stated in [1], Cloud Computing (CC) originated from distributed computing and has revolutionized the Information and Communication Technology (ICT) industry by introducing on-demand availability of computing resources. Significant advances in hardware capabilities have made computational resources easily available, and this progress has led to the emergence of cloud computing, in which resources are provided to multiple users on an on-demand, shared basis. Cloud computing is considered one of the most important computing paradigms in the IT sector. Built upon internet technology, it provides services to compute-intensive applications. To offer compute resources on demand over the internet from remote data centers, cloud providers share a pool of resources that can be accessed from any location in the world. Applications and data are stored in data centers, and services are delivered to users by cloud service providers via these cloud data centers. The workload of data centers is heterogeneous, so proper provisioning of resources is required to deliver a good quality of service to users.

Cloud computing depends heavily on a technology named virtualization. Virtualization was developed by IBM in the 1960s to make the most of underutilized mainframe hardware. With the help of virtualization, multiple virtual machines can run on a single host machine. Virtualization enables end-users and service providers to utilize cloud resources efficiently, with optimum usage at the least cost [2]. It is also responsible for handling the increasing resource demands of users in Cloud Data Centers (CDCs). Objectives such as load balancing, energy management, sharing of resources among multiple users, and fault tolerance can be achieved with the help of virtualization [3].

With this advancement in technology, the number of users requesting resources in CDCs has escalated rapidly, which has raised power consumption exponentially and made network operation costly. This ever-increasing process-to-resource ratio degrades network performance and increases energy consumption. Data centers are among the major contributors to the world's power and energy consumption. Further, due to the outbreak triggered by the novel coronavirus, organizations suspended office work and asked employees to work from home, which has led to an enormous increase in the use of cloud computing services. It is predicted that the information technology (IT) sector will consume up to 13% of global electricity by 2030, up from the current 7%, and this energy consumption is increasing at a rate of 12% every year. According to an analysis done in 2020, 60% of total data traffic was consumed by online shopping, gaming, and streaming, a share forecasted to rise to 80% over the next five years [4]. The increased energy consumption is not only due to the electronic devices involved in the cloud system but also because of inefficient utilization of resources.
Improvement in cloud resource utilization can lead to the minimization of this energy consumption. Here, dynamic VM consolidation comes into play to reduce the energy consumed and the carbon footprint [5] of the data center. Figure 1 shows the major reasons for energy inefficiency in data centers. Server utilization is expressed as the ratio of the resources in use to the total resource capacity of the server; for example, the current CPU utilization of a machine divided by its maximum CPU capacity gives an estimate of the utilization of that particular machine. Over the last decade, the demand for computing resources has increased exponentially, and this ever-increasing process-to-resource ratio has led to various performance issues [6]. The key problem with this growing demand for cloud resources is the overutilization of resources, which results in degradation of network performance.

Fig. 1 Causes of energy inefficiency

Also, if servers remain underutilized, they cause an exceptional increase in operational cost and power consumption; even an idle server consumes 70% of its peak power [7]. This overutilization and underutilization of resources results in low server utilization, which in turn is a major reason behind the wastage of energy and poses a great concern for the management of resources in data centers. Across the world, many initiatives [8], such as the Climate Neutral Data Centre Pact and the European Green Deal, are being taken to make data centers climate neutral. Numerous schemes have been proposed and several frameworks developed by various authors aimed at augmenting the energy efficiency of the cloud computing environment. However, virtual machine consolidation has proved to be the most efficient of all these techniques owing to its ability to reduce energy consumption while trying to avoid SLA violations [9]. Apart from energy consumption, other important challenges in cloud computing include transportability of data over the cloud, privacy and secrecy of shared data, reliability of the services rendered by providers, computing in low-bandwidth environments, and quality of service [10].

1.1 Scope of the Survey

Cloud computing plays a very substantial role in this era of digitization, and a large number of surveys have been conducted to analyze multiple facets of cloud computing services. The current research, however, presents an in-depth and detailed review of consolidation techniques based upon the consolidation step involved, the type of architecture, the experimentation environment used, the quality metrics achieved, the prediction methods employed, and the incorporation of metaheuristic techniques. The survey also identifies various challenges and open issues. A taxonomy has been developed from the seminal work of various researchers, along with the proposal of a consolidation model. Various state-of-the-art surveys are compared in Table 1 on the basis of the above-mentioned factors, a comparison that has not been performed before.

Table 1 Comparison of the state of the art with the proposed work

2 Research Contributions

The key contributions of the current research are as follows:

  • A taxonomy focussing on various computational engineering-based solutions including statistical and optimization methods for VM consolidation has been presented.

  • Detailed analysis of the existing consolidation solutions based upon the architecture, predictive methods employed, resources utilized, objectives considered and the consolidation step involved has been presented in the research article.

  • Observing the research gaps, various future recommendations are suggested to ensure energy efficiency in the cloud computing environment.

  • Considering the research gaps, a model for VM consolidation has been proposed.

2.1 Methods

An appropriate methodology has been adopted for conducting this survey. For the detailed analysis, research articles have been identified from high-quality journals and conferences indexed in IEEE Xplore, Springer, Wiley, ACM, and ScienceDirect. For recent facts and figures, government reports, articles, and blogs have also been consulted. Keywords such as energy reduction, migration, sustainable cloud, energy conservation, efficient cloud data centers, dynamic VM consolidation, overloaded host detection, VM selection, and VM placement have been used for performing the exploratory research.

2.2 Organization

The research article is divided into various sections, and its organization is shown in Fig. 2. Section 1 introduces cloud computing and the scope of this survey. Section 2 lists the contributions of this research article, the methods and materials used for performing the survey, and the organization of the article. Section 3 presents an overview of different hardware- and software-centric solutions for reducing energy consumption in cloud data centres, followed by the benefits of VM consolidation and the steps involved in the consolidation process. Section 4 details the various schemes put forward for the different consolidation steps. Section 5 includes the analysis of the existing consolidation schemes in terms of architecture, resources, objectives, and the predictive models applied, along with the development of a taxonomy based upon these solutions. The next section provides a glimpse of the gaps in the existing literature, followed by a proposed framework that attempts to overcome a few of these challenges. The article is concluded in Sect. 7.

Fig. 2 Roadmap of the paper

3 Background

This section primarily focusses on the energy reduction methods available in cloud computing and is divided into three subsections. The first subsection describes the different techniques available for energy reduction, the second details the need for VM consolidation, and the third describes the steps involved in the process of consolidation.

3.1 Power Saving Techniques for CC Environment

In the literature, various techniques have been put forward to achieve energy efficiency in cloud data centres. These techniques fall into two main categories: software-centric techniques and hardware-centric techniques. Hardware-centric techniques focus on solutions based on the infrastructure of the data centres and the IT components involved, such as variable-speed fans, slower drives, and cooling equipment. Software-centric solutions, on the other hand, make full use of virtualization technology. The categorization of these power saving techniques is shown in Fig. 3.

Fig. 3 An overview of different energy saving techniques [3]

3.2 Need of VM Consolidation

To deal with hardware failures of physical machines, the addition and deletion of virtual machines in the cloud resource management system, and the task of serving resource requests that keep evolving, the mapping and remapping of virtual machines across different hosts becomes unavoidable. Server consolidation is a technique that deals with the above-stated situations. Consolidation aims to minimize the total number of hosts needed to house a given set of virtual machines, thereby increasing the energy efficiency of the cloud data center. Consolidation of virtual machines is thus a method of improving resource utilization, refining energy efficiency, and reducing power consumption, and it is enabled by virtualization. Virtualization exploits migration to achieve the following goals:

  1. To configure virtual machines on a minimum number of physical machines.

  2. To manage overloaded/underloaded hosts.

  3. To turn off idle nodes or keep them in a low-power hibernation mode.

These goals form the objectives of the dynamic consolidation of machines. With the help of consolidation, system performance is improved by migrating virtual machines from one physical machine to another with the aim of balancing the load; the intent is to avoid the use of extra physical machines. Virtual machine consolidation is enabled by the technology of VM migration. An example of consolidation is shown in Fig. 4.

Fig. 4 VM consolidation example

Suppose there are 5 physical machines denoted as \(\left\{PM1, PM2, PM3, PM4, PM5\right\}\) each having the CPU utilization of \(\left\{50\%, 25\%, 40\%, 30\%, 10\%\right\}\) respectively. It can be seen that each physical machine has the scope of accommodating more virtual machines since they are not working at their full utilization. By migrating the virtual machines of \(PM4\) to \(PM1\), \(PM4\) can be put to hibernate mode or can be turned off, making \(PM1\) work at 80% utilization. Similarly, the virtual machines from \(PM3\) and \(PM5\) can be migrated to \(PM2\). The \(PM2\) now works at 75% utilization. Before consolidation, there were 5 active hosts. However, after consolidation, the number of hosts has reduced from 5 to only 2. Migrating the virtual machines between the hosts can reduce the number of active hosts and minimize energy consumption.

3.3 VM Consolidation Steps

The consolidation process is triggered by the detection of an overloaded or underloaded machine. The time to begin consolidation is when the host load exceeds the upper threshold or goes below the lower threshold. Once the overloaded or underloaded host machines have been identified, the next step is the selection of one or more virtual machines from the overloaded machines for migration. The virtual machines so selected help to bring back the host load to normal. In the case of underloaded machines, all the virtual machines are selected for migration so that after migration the underloaded machine can be turned off or put to an idle mode to save energy. The final step is the placement of the selected virtual machines to some other physical machine. This deployment step demands the migration of the virtual machines from one host to another. Figure 5 shows the steps involved in the process of consolidation.

Fig. 5 Process of consolidation
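To make the detect-select-place cycle concrete, the sketch below shows one consolidation round in Python. It is a minimal illustration rather than an algorithm from any cited work: the Host and VM structures, the threshold values, and the simple placement rule are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    vid: str
    cpu: float                       # current CPU demand (e.g. MIPS)

@dataclass
class Host:
    hid: str
    capacity: float                  # total CPU capacity (same unit as vm.cpu)
    vms: list = field(default_factory=list)

    def utilization(self) -> float:
        return sum(vm.cpu for vm in self.vms) / self.capacity

def consolidation_round(hosts, upper=0.8, lower=0.2):
    """One detect -> select -> place pass over all hosts (illustrative only)."""
    migrations = []
    for host in hosts:
        util = host.utilization()        # Step 1: overload/underload detection
        if util > upper:
            # Step 2: pick VMs (largest first here) until the host is back to normal.
            for vm in sorted(host.vms, key=lambda v: v.cpu, reverse=True):
                if host.utilization() <= upper:
                    break
                host.vms.remove(vm)
                migrations.append(vm)
        elif 0 < util < lower:
            # Step 2 for underload: evacuate everything so the host can sleep.
            migrations.extend(host.vms)
            host.vms.clear()
    # Step 3: place the selected VMs, preferring already-loaded hosts so that
    # the emptied ones can stay switched off.
    for vm in migrations:
        candidates = sorted(hosts, key=lambda h: h.utilization(), reverse=True)
        target = next((h for h in candidates
                       if h.utilization() + vm.cpu / h.capacity <= upper), None)
        # Fall back to the least-loaded host if nothing fits (kept simple here).
        (target or candidates[-1]).vms.append(vm)
    return hosts
```

In a real consolidator, each of the three steps would be replaced by one of the detection, selection, and placement policies surveyed in Sect. 4.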

4 State of the Art for Energy-Efficient VM Consolidation

This section discusses the seminal work of researchers providing solutions for the different steps of consolidation. The process of consolidation can be performed in two ways: static consolidation and dynamic consolidation. In static consolidation, the virtual machines are configured only once, according to the peak demands of the workloads. This ensures that virtual machines do not get overloaded and stay on the same host throughout their lifetime. Although static consolidation prevents machines from being overloaded, it is also a major cause of machine idleness, resulting in resource wastage. This problem can be solved using dynamic consolidation, in which the resource allocations of the machines change dynamically according to the workload demand. The changing demands may call for the migration of virtual machines when a host is found to be overloaded or underloaded. A categorization of the techniques involved in the different steps of dynamic consolidation, across different phases, is shown in Fig. 6. Dynamic consolidation periodically checks for changes in the workload and performs the corresponding updates; migration occurs mainly for two reasons:

  (i) When the capacities of the virtual machines exceed the holding capacity of the host.

  (ii) When the virtual machines can be accommodated on other hosts so that the host they are currently placed on can be switched to sleep mode.

Fig. 6 Categorization of VM consolidation techniques

4.1 Consolidation Time Selection

A dynamic consolidation is prompted whenever a host is detected as overloaded or underloaded. To find the load status, various threshold-based techniques have been suggested in the literature. In the threshold-based techniques, an upper or lower threshold is set. Comparing the utilization values with this threshold value, a machine can be recognized as overloaded or underloaded. The threshold can be either static or dynamic. In the case of a static threshold, the threshold values remain fixed throughout and do not change whereas, with a dynamic threshold, the values can change over a period depending upon certain criteria.

4.1.1 Detection Based Upon Static Threshold

Aimed at minimizing power consumption without violating the SLA, Beloglazov et al. [19] have proposed a technique based upon a double threshold, an upper and a lower threshold, to decide when the consolidation process should begin. The proposed technique tries to keep the CPU utilization of the servers between these two defined thresholds. If the utilization of any host exceeds the upper threshold, VM(s) are migrated from this host to some other host, trying to bring the utilization of the overloaded host back below the threshold value; the goal is to avoid SLA violations. On the other hand, if the total utilization value falls below the lower threshold, then to reduce power consumption the virtual machines from the underloaded host can be migrated elsewhere. Such migrations allow the hosting machines to be turned off, so the underloaded machines can be switched from active to sleep mode. This was followed by a static double threshold method that considers both CPU and disk utilization [20]. Yahya et al. [21] have focussed on setting a threshold based upon the temperature metric of the host, unlike the traditional methods that focus on other system resources such as CPU, RAM, or bandwidth. A parameter called excess capacity has been introduced by the authors of [22] to evaluate the remaining capacities: excess capacity equals the maximum capacity minus the sum of the CPU capacities of all the VMs placed on that host. Xiao et al. [23] have focussed on CPU utilization to set double thresholds for deciding the load status of the host; the two thresholds are set statically, dividing the hosts into the categories of overloaded, underloaded, and normally loaded hosts. A summary of host load detection techniques based on the static threshold is shown in Table 2.

Table 2 Static Threshold-based host load detection techniques
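As a minimal illustration of the double static threshold idea in [19], the helper below classifies a host from its CPU utilization; the 0.2/0.8 values are placeholders rather than the thresholds used by the cited authors.

```python
def classify_host(cpu_utilization, lower=0.2, upper=0.8):
    """Double static threshold classification; the 0.2/0.8 values are
    placeholders rather than the thresholds used in the cited works."""
    if cpu_utilization > upper:
        return "overloaded"      # some VMs must be migrated away
    if cpu_utilization < lower:
        return "underloaded"     # migrate all VMs and put the host to sleep
    return "normal"
```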

4.1.2 Detection Based Upon Adaptive Threshold

It has been observed that static thresholds are not appropriate for dynamic workloads. To deal with the dynamic nature of the workloads and the changing needs of users in the cloud system, Beloglazov and Buyya [24] have proposed several methods based on adaptive thresholds. The main idea behind adaptive thresholds is to evaluate the deviation of CPU utilization: the greater the deviation, the more likely the host is to reach full utilization, and hence the lower the upper threshold is set. For the adaptive threshold, statistical analysis of historical data is performed. The approaches include Median Absolute Deviation (MAD), Interquartile Range (IQR), and local regression. MAD is used to measure statistical dispersion and is defined as the median of the deviations of the utilization values from the median of the data; this calculated value is used to derive the upper threshold. IQR is the difference between the first and third quartiles. The method of local regression suffers from the problem of outliers; to deal with this issue, robust local regression (LRR) has been proposed. To improve the efficiency of the network, Chang et al. [25] have predicted the workload on the servers with the help of recurrent neural networks. To minimize SLA violations and performance degradation while maintaining an optimal level of energy consumption, Sharma and Saini [26] have introduced a novel technique for the consolidation of virtual machines in the cloud environment. In 2017, Farahnakian et al. [27] proposed a self-adaptive threshold detection algorithm, considering CPU and memory as the resources for calculating utilization. Depending upon the probability of the prediction error, Minarolli et al. [28] have also taken care of the overheads of live migration; Gaussian processes have been used for making predictions, whereas the kernel density estimation method has been used for modeling the prediction error. To minimize the energy consumption of the cloud computing environment, Dambreville et al. [29] have considered the scheduling problem where each server is assumed to have a variable speed and the power of the servers depends upon the processing speeds. Depending upon the deviations of the demands from the available resources, a probability of being overloaded is calculated by Li et al. [30]; this overload probability is used to calculate the adaptive threshold value. Patel and Patel [31] have tried to predict the maximum number of hosts that can be vacated to normalize the load, and this predicted value helps in determining the lower threshold. The work proposed in [31] has been improved by Saadi and El Kafhali [32], considering both the utilization and the characteristics of the server to minimize the overall consumed energy. For selecting an overloaded host, a dynamic self-adaptive technique has been put forward by Xie et al. [33]; the proposed technique uses multiple thresholds to check the utilization of the hosts in the system. In 2018, Zhou et al. [34] analyzed the application of time series to evaluate ascending and descending trends (ADT) in host overload detection. Table 3 shows an overview of adaptive threshold-based host overload detection techniques. A prediction model based on a hybrid recurrent neural network has also been suggested by Karim et al. [35].

Table 3 Adaptive threshold-based host detection techniques
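The dispersion-based rules can be sketched as follows; the threshold shape T_u = 1 - s * dispersion follows the MAD/IQR formulation of [24], while the safety parameter s and the use of Python's statistics module are illustrative choices.

```python
import statistics

def mad(history):
    """Median absolute deviation of a CPU-utilization history (values in [0, 1])."""
    med = statistics.median(history)
    return statistics.median(abs(x - med) for x in history)

def iqr(history):
    """Interquartile range of a CPU-utilization history."""
    q1, _, q3 = statistics.quantiles(history, n=4)
    return q3 - q1

def adaptive_upper_threshold(history, s=2.5, method="mad"):
    """Upper threshold of the form T_u = 1 - s * dispersion; s is a safety
    parameter trading energy savings against the risk of SLA violations."""
    dispersion = mad(history) if method == "mad" else iqr(history)
    return 1.0 - s * dispersion
```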

4.1.3 Regression-Based Adaptive Threshold

Based upon the prediction of CPU usage, Farahnakian et al. [36] have presented a novel technique based on linear regression to find the overloaded and underloaded hosts in the cloud environment. Host detection using short-term prediction is carried out during the live migration of virtual machines while the process of VM consolidation is in progress. Given the current resource request, the regression model is used to forecast the near-future host utilization. For the detection of overloaded hosts, Yadav et al. [37] have proposed two adaptive threshold models based upon regression: one is gradient-descent-based, while the other is maximum-correlation-percentage-based regression. To optimize the load and the power consumed in the computing environment, Hieu et al. [38] have tried to find the over-utilized and under-utilized servers depending upon the current resource utilization as well as the predicted future utilization of the server; the future utilization is estimated based upon the local history of resource utilization. In 2017, Abdelsamea et al. [39] considered RAM and bandwidth in addition to CPU, extending the concept of linear regression while computing host utilization. Yadav and Zhang [40] have developed M-estimation regression (MeReg), where regression is used with heuristics. Jararweh et al. [41] have used a static threshold for determining the lower threshold, whereas the upper threshold is calculated using MAD [24]; a resource utilization of 10% is treated as the lower threshold point, and whenever the utilization of any host falls below this predefined threshold, the host is considered to be underloaded. Apart from energy consumption and SLA violations, the authors of [42] have laid stress on the robustness of the upper threshold of CPU utilization, which helps in maximizing the usage of the resources. Intending to reduce energy consumption, the authors of [43] have focussed on maximizing resource utilization by trying to minimize the degree of imbalance between the hosts; the Pearson correlation coefficient has been taken into consideration while finding the overloaded host. Khan [44] has also proposed a resource-utilization-based model for the detection of overloaded hosts, bringing in the concept of the cumulative available-to-total ratio for determining underloaded hosts. In Table 4, regression-based host overload detection schemes have been summarized.

Table 4 Regression-based host detection techniques
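A bare-bones version of regression-based detection is shown below: a straight line is fitted to the recent utilization history and the host is flagged when the extrapolated value crosses a threshold. This is a sketch in the spirit of [36], not the cited authors' exact algorithm.

```python
import numpy as np

def predicted_overload(history, horizon=1, threshold=1.0):
    """Fit a straight line to the recent utilization history and flag the host
    when the value extrapolated 'horizon' steps ahead crosses the threshold."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, np.asarray(history, dtype=float), deg=1)
    predicted = slope * (len(history) - 1 + horizon) + intercept
    return predicted >= threshold
```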

4.1.4 K Nearest Neighbour (KNN) Based Adaptive Threshold

A prediction method based on K Nearest Neighbour has been introduced by Farahnakian et al. [45] to predict future resource usage. As the CPU is considered to be the main source of energy consumption, CPU usage has been forecasted. The prediction depends upon the historical data collected during the lifetime of the host, which serves as the input training set. Farahnakian et al. [46] have modified the Best Fit Decreasing algorithm to place the migrated VMs; the modification predicts CPU utilization in the early stages with the help of K nearest neighbor regression. To reduce energy consumption, Zhou et al. [47] have put forward a dynamic threshold method named the adaptive three-threshold aware technique. The technique makes use of historical data on resource utilization, and the hosts are divided into three categories, namely highly loaded, moderately loaded, and lightly loaded. Two adaptive K-means clustering-based algorithms have been proposed, one combined with MAD (KAM) and the other with IQR (KAI). Table 5 summarises the schemes detecting host overload based on the K nearest neighbor method.

Table 5 KNN based host detection techniques
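The KNN idea can be illustrated with a short forecast routine that learns from sliding windows of the utilization trace; it assumes scikit-learn is available and is only a sketch of the approach in [45], not the cited authors' model.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def knn_forecast(history, window=5, k=3):
    """Forecast the next CPU-utilization sample from sliding windows of the
    past trace using k-nearest-neighbour regression; needs at least
    window + k samples of history."""
    history = np.asarray(history, dtype=float)
    X = np.array([history[i:i + window] for i in range(len(history) - window)])
    y = history[window:]
    model = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    return float(model.predict(history[-window:].reshape(1, -1))[0])

# Example: flag the host as overloaded when the forecast crosses a threshold.
# overloaded = knn_forecast(cpu_trace) >= 0.9
```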

4.1.5 Machine Learning-Based Adaptive Threshold

CPU utilization changes frequently depending upon user demands, which can lead to unnecessary migrations and SLA violations. To avoid these frequent migrations, Hopcroft et al. [48] have proposed a double threshold mechanism and used the Gray-Markov model for overload detection and prediction. If the current CPU utilization is greater than the threshold value, the next three consecutive CPU utilization values are checked against the threshold; if they also exceed the defined threshold, the corresponding host is declared overloaded. In 2017, Melhem et al. [49] proposed a method based on present and forecasted loads to check whether there is a need to migrate the virtual machines. The current load status is checked using either a static or a dynamic threshold method. In the static case, the lower threshold corresponds to 0.1 and the upper threshold to 0.9; in the dynamic case, the lower threshold is set at 0.1 while the upper threshold is calculated using the Median Absolute Deviation method, which relies on analysis of historical data. Markov chains are used to predict the future load status. The overload threshold has been modeled as a Markov decision process by Li [50]; in addition, the selection of the overload threshold has been optimized using the Bellman optimality equation, and memory requirements have been considered along with CPU utilization to model the consolidation problem. To address the problem of server consolidation, Hsieh et al. [51] have extended the work of [46] by identifying overloaded and underloaded hosts using the present and the forecasted resource utilization. To forecast the CPU utilization, a Gray-Markov model exploiting the historical data of the hosts has been used by the authors. Instead of servers, the authors have focussed on the future resource utilization of the VM to make more fruitful placements. Discrete-time Markov chains have also been utilized by the authors of [52] to categorize hosts in terms of their utilization. In Table 6, host overload detection schemes based upon machine learning have been summarized.

Table 6 Machine learning-based host detection techniques
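A minimal flavour of the Markov-based detectors is given below: the utilization trace is discretized into load states, a first-order transition matrix is estimated, and the most probable next state is returned. The state discretization and the Laplace smoothing are illustrative assumptions, not taken from any cited scheme.

```python
import numpy as np

def likely_next_state(history, n_states=3):
    """Discretize a utilization trace (values in [0, 1]) into load states,
    estimate a first-order transition matrix with Laplace smoothing, and
    return the most probable next state (0 = low ... n_states - 1 = high)."""
    states = np.minimum((np.asarray(history, dtype=float) * n_states).astype(int),
                        n_states - 1)
    counts = np.ones((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    transition = counts / counts.sum(axis=1, keepdims=True)
    return int(np.argmax(transition[states[-1]]))
```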

4.2 VM Selection (VMS)

Once the host has been identified for initiating the migration process, the next step is the selection of a VM from that host for migration. This section briefs about numerous such techniques put forward by different authors. Focussing on the problem of VM selection, Beloglazov et al. [19] proposed three policies, namely Random Choice (RC), Minimum of Migrations (MM), and Highest Potential Growth (HPG). In the random choice strategy, a virtual machine is selected at random, following a uniform distribution; the scheme keeps selecting virtual machines until the load of the host returns to normal. The MM policy migrates the minimum number of virtual machines needed to reduce the CPU utilization of the host back below the threshold. The HPG policy prevents SLA violations by selecting the VM with the minimum usage of CPU. Two more strategies for VM selection have been proposed in [24]. The Minimum Migration Time (MMT) policy selects the VM that requires the minimum time for migration. The other is the Maximum Correlation (MC) policy, which finds the virtual machine whose CPU utilization shows the maximum correlation with that of the other machines; the idea is that a virtual machine showing higher correlation is more likely to cause overload on the host. In maximum correlation, the associativity of a virtual machine with the others is calculated and the machine with the maximum correlation is selected for migration. The authors of [53] have proposed a technique extending the existing Maximum Correlation policy using the concept of negative and positive correlation. According to the authors, virtual machines having positive correlation are more likely to cause load imbalance than virtual machines with negative correlation; the machine with the maximum positive correlation with the others is chosen for transfer. Alboaneen et al. [54] have focussed on the bandwidth of the available VMs: the VM with the maximum requested bandwidth is migrated to another host to reduce the load of the machine on which it is hosted. To select the virtual machine(s) for migration, Hopcroft et al. [48] have considered utilized CPU and utilized memory. Depending upon the contribution made by the CPU of each virtual machine to the total utilization of the host, Masoumzadeh and Hlavacs [55] have proposed a maximum utilization policy that selects the virtual machine with the maximum utilization compared to the rest; contrary to this, a minimum utilization policy has also been proposed that selects the virtual machine with the minimum utilization. Based upon these two criteria, VM selection is performed over the overloaded hosts. To select the virtual machine for migration from a heavily loaded host, Zhou et al. [47] have proposed three policies, namely Minimum Memory Size (MMS), Lowest CPU Utilization (LCU), and Minimum Product of CPU usage and Memory size (MPCM). In MMS, the virtual machine with the minimum allocated memory is selected for migration as it will consume less bandwidth. In LCU, the VM with the minimum CPU utilization is migrated. In MPCM, the product of CPU utilization and allocated memory is calculated and the machine with the minimum value of this product is selected for migration.
In 2016, Bala and Chana [56] used the correlation factor for VM selection. In the case of overloaded hosts, the correlation factor between the virtual machine and the resource utilization is calculated, and the virtual machine with the maximum correlation factor is selected for further consideration; once it reaches zero, the VM migration takes place. Taking advantage of intelligent decision-making, Alaul et al. [57] have proposed a technique for VM selection based on fuzzy logic. To deal with the dynamic nature of the cloud data center, the authors provide different kinds of inputs to the fuzzy system to achieve a better trade-off between the consumed energy and the SLA violations; these include the migration time, the correlation factor, and the fluctuations in CPU utilization. The linguistic variables used for the inference engine include RAM, correlation, and the standard deviation. The authors have chosen the trapezoidal membership function as it deals better with flat regions. A virtual machine with the highest fuzzy output is selected for migration if its CPU utilization exceeds the defined threshold. Shidik et al. [58] have considered RAM along with CPU utilization while formulating the VM selection problem. The problem has been resolved using fuzzy logic and the Markov algorithm: fuzzy logic helps in categorizing the attributes of the candidate virtual machines, whereas the Markov model is employed to find the best VM according to these attributes. The triangular membership function of fuzzy logic has been used by the authors in their proposed technique; the fuzzy logic helps in simplifying the production rules for the Markov algorithm to reach an optimal decision. The technique has been able to reduce the energy consumption of the cloud data center. Once an overloaded host has been found with the help of the Pearson correlation, the next step is to find the virtual machine that is to be migrated. To address this issue, the authors of [59] have used CPU usage and the allocated memory. According to them, the greater the CPU utilization of a VM, the more processing capacity will be released after its migration; similarly, the smaller the allocated memory, the lower the dependence of the virtual machine on others and the lower the migration cost and time. To find the potential virtual machine for migration, Hieu et al. [38] have introduced the concept of the temperature ratio, where the resource with the maximum utilization is considered the hottest resource of the host. In 2017, Mosa and Sakellariou [60] selected the virtual machines with the minimum RAM demand so that a minimal amount of time is required for migration. On the contrary, Yadav et al. [61] have performed the selection based on CPU usage: the virtual machines are sorted in descending order of their CPU utilization and the machine with the minimum size is then selected. Thus, the approach selects the machine with high CPU usage and minimum size, which helps in reducing the amount of energy consumed during VM migration. Yadav et al. [37] have utilized the network bandwidth to find the virtual machine to be selected for migration. The purpose is to select the VM with the least current utilization and the minimum migration time; thus, out of all the virtual machines in the overloaded host, the one that requires the minimum time for migration is selected. Chang et al. [62] have calculated the impact of the virtual machines on the load of the host.
This has been achieved by using the Euclidean distance of the resources: the utilization values of the resources are normalized first and then used to calculate the Euclidean distance. Wang and Tianfield [63] have considered CPU utilization while selecting the VM; the one with the maximum utilization is chosen. If the host remains overloaded, the virtual machine with the next highest CPU utilization is selected for migration, and this process is repeated until the host load falls below the threshold value. According to the authors, migrating the VM with the highest CPU utilization provides an opportunity to reduce the load very quickly. Li et al. [64] have presented a VM selection technique based on the concept of content similarity. According to the authors, migrating virtual machines with high content similarity among their memories helps save the bandwidth needed to transfer the data, along with reducing the amount of data transferred and the transfer time. Lin et al. [65] have built a reverse selection method considering all the resources, i.e., memory, bandwidth, CPU, and storage. The technique detects the most fitting VM from the set of virtual machines for each randomly selected host; this reverse selection depends upon an iterative algorithm based on a budget heuristic. To achieve better migration costs, Xie et al. [33] have proposed a VM selection algorithm that takes as input all the available resources instead of considering only the CPU. The utilization of these resources is used to calculate a variable called the utilization ratio of resources, which gives an estimate of the utilization of all the resources (CPU, RAM, BW); the VM with a higher weight is given preference over the others for migration. Zhou et al. [34] have proposed a memory and CPU usage-dependent technique for selecting the migrant virtual machine: in the case of an overloaded host, the virtual machines are arranged in ascending order and the virtual machine with lower memory usage is preferred over the others. Choudhary et al. [66] have introduced the concept of fuzzy logic into the virtual machine selection process. The inputs to the fuzzy logic comprise the RAM utilization and the standard deviation of CPU utilization along with the correlation coefficients; the SLAV parameters have also been added to the list. All these parameters are fed to the fuzzy function, and the virtual machine with the highest fuzzy output is selected for migration if its current CPU utilization exceeds the threshold value. The use of fuzzy logic gives better results as compared to other traditional methods of VM selection. Xiao et al. [23] have introduced three different CPU utilization-based virtual machine selection policies that are applied after a host has been identified as overloaded; however, if a host is found to be underloaded, then all of its virtual machines are migrated to some other host. Yadav et al. [42] have calculated the CPU utilization of virtual machines over different time frames. The prediction of MAD values has been used to find suitable virtual machines for reallocation from the overloaded hosts; the predicted values help in finding out which virtual machines will impose a lower load than the others. Thus, the virtual machine with the minimum MAD value is selected for migration.
To select the virtual machine, Saadi and El Kafhali [32] have calculated the deviation of the current host utilization from the upper threshold. The virtual machine whose CPU utilization is greater than or equal to this deviation is migrated; the machine selected is expected to have the least migration time. Taking the imbalance degree into consideration, Mapetu et al. [43] have proposed an optimization policy to reduce the migration count. The selection criterion is based on the product of the maximum MAD and the least imbalance degree, and the migration delay has been computed with the help of RAM and bandwidth. Table 7 summarizes the VM selection policies proposed by numerous authors in the literature.

Table 7 VM selection techniques
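As an illustration of how a selection policy is applied, the sketch below implements the Minimum Migration Time idea from [24], approximating migration time by allocated RAM divided by available bandwidth and repeating the selection until the host is expected to return below the upper threshold; the CandidateVM structure and the units are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class CandidateVM:
    vid: str
    cpu: float        # contribution to host utilization (fraction of capacity)
    ram_mb: float     # allocated RAM in MB

def select_mmt(vms, bandwidth_mb_s):
    """Minimum Migration Time: the VM whose memory can be copied over the
    available bandwidth in the least time, the usual RAM/bandwidth proxy."""
    return min(vms, key=lambda vm: vm.ram_mb / bandwidth_mb_s)

def select_until_normal(vms, host_utilization, upper, bandwidth_mb_s):
    """Repeat MMT selection until the host is expected to drop below the
    upper threshold; returns the list of VMs chosen for migration."""
    selected, remaining, util = [], list(vms), host_utilization
    while util > upper and remaining:
        vm = select_mmt(remaining, bandwidth_mb_s)
        remaining.remove(vm)
        selected.append(vm)
        util -= vm.cpu
    return selected
```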

4.3 VM Placement (VMP)

Once the virtual machine to be migrated has been selected, the final job is to find a suitable host that can house it. The methods used for finding a host for the deployment of the virtual machine include greedy heuristics, constraint satisfaction programming, linear programming, and meta-heuristic techniques.

4.3.1 VMP Based on Greedy Heuristics

Conventionally, in the Bin Packing Problem, objects of given sizes are to be packed into bins of fixed sizes, with the primary goal of minimizing the number of bins used to hold the given set of items. The Virtual Machine Placement Problem (VMPP) can be modeled as a Bin Packing Problem (BPP) where the available hosts correspond to the bins, while the virtual machines selected for migration can be thought of as objects of variable sizes that need to be packed into the bins. The Bin Packing Problem belongs to the category of NP-hard problems. Mishra and Sahoo [67] studied the use of various greedy algorithms and have put forward the shortcomings of various placement algorithms. Trying to overcome these drawbacks, the authors have represented the placement problem with the help of a mathematical model; vector arithmetic has been used to represent VM placement as an object in a two-dimensional space with the resources as the dimensions of the object. Motivated by greedy algorithms, the authors of [68] have proposed an algorithm called SERCON, which inherits the usefulness exhibited by First Fit and Best Fit. The concept behind the proposed scheme is that the least loaded hosts should free their capacities if other hosts can accommodate the virtual machines placed on these low-loaded hosts. To achieve this, the first step is sorting the host machines in decreasing order of their loads. Once the hosts have been sorted, the virtual machines are selected from the least loaded host based on calculated scores; for the calculation of scores, the authors have considered CPU and memory resources. In an attempt to reduce the amount of energy consumed, Beloglazov et al. [19] have presented an energy-aware VM placement technique. The proposed scheme is a modification of Best Fit Decreasing (BFD); the idea is to choose the host that leads to the minimal rise in power consumption after the placement of the virtual machine, so the most power-efficient host is chosen first. Huang and Tsang [69] have modeled virtual machine consolidation as a minimum cost optimization problem that aims to place the virtual machines on a given group of hosts while reducing service level agreement violations. The problem has been simplified by fixing the number of hosts in advance; this sub-problem has been modeled as an M-convex problem and solved using the Lagrangian method. Fu and Zhou [70] have proposed a scheme for virtual machine placement that computes a correlation coefficient during the consolidation process. The correlation coefficient helps in finding the associativity between two objects, the objects here being the hosts and the virtual machines; thus, the host that has the minimum correlation with the migrated VM is chosen for placing it. Esfandiarpoor et al. [71] have considered the network structure and the cooling aspect of the data center for VM placement during the consolidation process, which reduces the number of racks and routers in active mode. The proposed scheme, OBFD, is a variation of modified BFD: instead of sorting the virtual machines in terms of CPU utilization, they are arranged in decreasing order of requested MIPS. Also, three structure-based virtual machine placement algorithms have been presented to reduce the number of racks in active mode. To avoid violations of SLA and frequent migrations, Farahnakian et al.
[46] have put forward two constraints that need to be checked while migrating the virtual machines. According to the first constraint, the host selected for the placement of the virtual machine should have sufficient resources to accommodate it; the second constraint says that the host should not get overloaded in the near future, provided it holds sufficient resources to cater to the demands. Mosa and Paton [72] designed the VMPP with the help of utility functions. The proposed placement strategy helps in optimizing both energy and service level agreements. Intended to benefit the IaaS provider by maximizing the utilization of the available resources, the scheme uses a genetic algorithm to find possible solutions to the placement of a VM on a host. The initial population has been generated randomly, while the crossover operation has been performed using a uniform crossover technique where swapping is performed beyond the crossover point. The proposed utility-based method can be applied only if there is an adaptation that can improve upon the current allocation. Zhou et al. [47] have extended the Best Fit Decreasing algorithm and presented its modified version named Energy-Aware Best Fit Decreasing (EABFD). If a host has sufficient resources to place the migrated machine, then the utilization of the host after the placement of the machine is checked; if the utilization lies between the limits of moderately loaded and lightly loaded machines, the host is added to the list of considered hosts. All these hosts are then checked for the increase in power consumption after the placement of the migrated virtual machine, and the host with the minimal increase is selected for the deployment. The framework VMCUP-M proposed by Hieu et al. [38] is an extension of the PABFD algorithm in which the authors consider multiple resource usage. This multiple resource usage is a predicted value that helps in reducing the chances of the target host being overloaded in the future; the forecasted utilization value and the power increase after the placement of the virtual machine act as the parameters for the selection of the host, and the host with the minimal increase in power consumption is selected as the target host. The Hidden Markov Model has been used by Hammer et al. [73] in 2017 to tackle the VMPP. In the proposed scheme, the Hidden Markov Model helps in replicating the consumption of the CPU and its properties in a realistic way; the time of day decides the changeover of the CPU state from highly active to inactive. The Markov model has also been used by Rajabzadeh and Toroghi in [59] for solving the VMPP. The presented scheme is an extended version of PABFD [24] with the incorporation of simulated annealing into the Markov model. The deployment of VMs is performed in such a manner that the selected host is not left overloaded after migration. With the use of the interquartile range, the underloaded and overloaded hosts act as outliers, leaving the set of remaining hosts that can be used for the placement of the virtual machines. For these candidate hosts, an energy factor called energy expenditure is computed; the lower the value of this parameter, the higher the chances of the corresponding host being selected as the destination for the VM placement. To maintain the balance between energy, QoS, and temperature, Zahedifard et al.
[74] have introduced a technique that takes care of virtual machine placement during the consolidation process. The hosts have been classified into high and low categories; servers that consume more energy while performing fewer operations per second fall into the low category. The aim is to prioritize the high-category hosts while trying to minimize the use of the low-category hosts. Out of the list of high-category hosts, the host with the minimum power increase and more resource capacity after the placement of the virtual machine is selected for the deployment process; hosts with more resource capacity are expected to contribute less to frequent migrations. Patel and Patel [31] have presented a utilization-aware selection and deployment scheme for the underloaded hosts. The over-utilized and the turned-off hosts are omitted from being selected as candidate hosts for new virtual machine placements, which leads to reduced computation costs. Motivated by the work of [24], Wang and Tianfield [63] have proposed a technique called Space-Aware Best Fit Decreasing. In the suggested scheme, the VMs that need to be migrated are sorted in decreasing order of their CPU utilization; thus, the VM having the maximum utilization is the first one to get migrated. Three correlation-based algorithms have been presented by Chen et al. [75] to generate an effective placement scheme. Neural networks have been employed to predict resource utilization based upon historical data, and this predicted value is then fed to the placement algorithms. The first proposed scheme is a modification of First Fit, which the authors term the correlation-aware first fit virtual machine placement algorithm; the second is a modification of the Best Fit VMP algorithm, and the third is based on the Minimum Bin Slack algorithm. Arani [76] has presented a learning-based placement technique based on the Best Fit Decreasing algorithm. For learning purposes, the authors have used the correlation coefficient along with ensemble prediction and automata theory; incorporating these three into the Best Fit Decreasing algorithm improves the decision-making. Gupta and Amgoth [77] have proposed a resource-aware scheme to reduce the count of active hosts. The presented scheme is completely resource aware and deploys the VM on the host with the highest utilization. To target such a placement, the authors have put forward a parameter called the Resource Usage Factor (RUF), which depends upon the average resource utilization, the normalized value of the resource utilization, and the normalized value of the remaining resources. The higher the value of RUF, the higher the chances of the host being available for placing the migrated virtual machines. The approach proposed by Xie et al. [33] for VMP is an advancement of the Best Fit algorithm. In the proposed multi-weight best-fit algorithm, the deviation of the threshold from the actual utilization of the resources, i.e., CPU, RAM, and BW, is computed, and the host with the minimal increase in power usage is chosen for housing the migrated virtual machine. Mishra et al. [78] have presented an energy-efficient scheme for the placement of virtual machines in which the virtual machines are first divided into different categories depending upon their resource requirements; these categories include CPU-intensive, input-output-intensive, memory-intensive, and bandwidth-intensive.
The host with the minimum remaining capacity is the first one to be checked. If no host in active mode can house the virtual machine, then one of the hosts in sleep mode is woken up and the virtual machine is deployed there. Haghshenas and Mohammadi [79] have presented two approaches based on linear regression to find a suitable destination host; the two techniques, OLLRBA (the one-level linear regression-based approach) and TLLRBA (the two-level linear regression-based approach), apply linear regression at two different points. Mapetu et al. [43] have proposed a Pearson-based PABFD algorithm for the virtual machine placement problem of the consolidation process. The Pearson correlation coefficient uses the RAM and CPU utilization, and the VM is housed by the server that shows the least increase in power consumption. The authors in [32] have extended the work of [31], stating that the association between power efficiency and CPU utilization differs from server to server depending upon the technical specifications: hosts operating at the same CPU utilization may differ in their power consumption. Thus, the type of the server has also been taken into consideration while calculating the CPU utilization. Zang et al. [9] have put forward a reinforcement learning-based technique for VM placement. VM placement techniques based on greedy heuristics have been summarized in Table 8.

Table 8 Greedy heuristics-based VM placement techniques
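Many of the greedy schemes above extend the power-aware best-fit-decreasing pattern, sketched below with a linear power model; the Host/VM structures, the power figures, and the CPU-only capacity check are simplifications made for illustration, not details taken from the cited works.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    vid: str
    cpu: float                       # CPU demand (e.g. MIPS)

@dataclass
class Host:
    hid: str
    capacity: float                  # CPU capacity (same unit as vm.cpu)
    vms: list = field(default_factory=list)

    def utilization(self) -> float:
        return sum(v.cpu for v in self.vms) / self.capacity

def estimated_power(util, p_idle=70.0, p_max=110.0):
    """Linear power model often used in consolidation studies: idle power plus
    a utilization-proportional share of the dynamic range (Watts illustrative)."""
    return p_idle + (p_max - p_idle) * util

def pabfd(vms, hosts, upper=0.8):
    """Power-aware best fit decreasing in the spirit of [24]: VMs are taken in
    decreasing order of CPU demand and each goes to the feasible host whose
    estimated power consumption increases the least. An empty host is treated
    as already powered on, which is a simplification."""
    placement = {}
    for vm in sorted(vms, key=lambda v: v.cpu, reverse=True):
        best_host, best_increase = None, float("inf")
        for host in hosts:
            new_util = host.utilization() + vm.cpu / host.capacity
            if new_util > upper:
                continue                       # this host would become overloaded
            increase = (estimated_power(new_util)
                        - estimated_power(host.utilization()))
            if increase < best_increase:
                best_host, best_increase = host, increase
        if best_host is not None:
            best_host.vms.append(vm)
            placement[vm.vid] = best_host.hid
    return placement
```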

4.3.2 VMP Based on Constraint Satisfaction Problem

The aim of the Constraint Satisfaction Problem (CSP) is to find values for variables from given domains while satisfying a set of constraints. The problem of VMP can also be modeled as a CSP. Constraint programming can be thought of as a form of logic programming that is mainly used to solve complex combinatorial problems. Dupont et al. [80] suggested an energy-aware algorithm for the reallocation of virtual machines; the disassociation of the constraints from the algorithm gives it a flexible nature. The objective is to minimize energy consumption and CO2 emissions. Zhang et al. [81] have presented a constraint-programming-based allocation model that is capable of handling different types of workloads, such as CPU-intensive, memory-intensive, or I/O-intensive. At the local level, the utilizations are evaluated to find the load on each host, whereas the second level corresponds to the global task of packing the virtual machines onto suitable hosts. Tchana et al. [82] have introduced software consolidation into dynamic VM consolidation. The proposed scheme is based on constraint programming and has been used with live migration, where the goal is to reduce energy and save operational costs; the authors have relied on consolidation of the software, which can be used in both public and private clouds. Table 9 summarizes the constraint satisfaction problem-based VM placement techniques.

Table 9 CSP based VM placement techniques
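As a sketch of how placement can be expressed declaratively, the model below uses Google OR-Tools CP-SAT (assumed to be installed) to assign every VM to exactly one host, respect CPU capacities, and minimize the number of active hosts; real formulations such as [80, 81] include many more constraints and objectives.

```python
from ortools.sat.python import cp_model

def csp_placement(vm_cpu, host_capacity):
    """Assign every VM to exactly one host without exceeding CPU capacities
    while minimizing the number of active hosts. vm_cpu and host_capacity are
    integer demands/capacities (CP-SAT works on integers), CPU only."""
    model = cp_model.CpModel()
    n_vms, n_hosts = len(vm_cpu), len(host_capacity)
    x = {(i, j): model.NewBoolVar(f"x_{i}_{j}")
         for i in range(n_vms) for j in range(n_hosts)}
    active = [model.NewBoolVar(f"active_{j}") for j in range(n_hosts)]

    for i in range(n_vms):                       # each VM is placed exactly once
        model.Add(sum(x[i, j] for j in range(n_hosts)) == 1)
    for j in range(n_hosts):                     # capacity only counts if active
        model.Add(sum(vm_cpu[i] * x[i, j] for i in range(n_vms))
                  <= host_capacity[j] * active[j])
    model.Minimize(sum(active))                  # as few active hosts as possible

    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
        return {i: j for (i, j), var in x.items() if solver.Value(var)}
    return None
```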

4.3.3 VMP Based on Linear Programming (LP)

Unlike logic programming, linear programming utilizes estimation models that find results with the help of probability distributions. Such models mainly cover solution finding where the future demands are not certain; thus, in cases where the future resource demands of the machines are not known with certainty, linear programming can be used. Tseng et al. [84] have presented virtual machine placement as a service-oriented optimization scheme based on integer linear programming; based on graph theory, a tree algorithm and a forest algorithm have been proposed to deal with this optimization problem. Zeng et al. [85] have proposed a network-aware VM placement technique where the problem is formulated in terms of linear programming. The resource requirements of the VMs, the traffic between each VM pair, and the rack-wise architectural description of the network form the inputs to the linear program. VMs with high traffic rates are given priority in the sense that VMs with heavy mutual traffic are paired to be placed on the same server. Huang and Tsang [86] have proposed a framework that automates the process of VM consolidation, intending to improve the allocation of virtual machines to servers; M-convexity theorems and functions have been used to find the solution. Table 10 summarizes linear programming-based VM placement techniques.

Table 10 LP-based VM placement techniques
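The core model underlying such formulations is the bin-packing-style integer program below, written here with a single resource for clarity; it is a generic statement of the problem rather than the exact model of any cited paper.

\[
\begin{aligned}
\min \quad & \sum_{j=1}^{m} y_j \\
\text{s.t.} \quad & \sum_{j=1}^{m} x_{ij} = 1, \quad i = 1,\dots,n \\
& \sum_{i=1}^{n} r_i\, x_{ij} \le C_j\, y_j, \quad j = 1,\dots,m \\
& x_{ij} \in \{0,1\}, \quad y_j \in \{0,1\}
\end{aligned}
\]

Here \(x_{ij}=1\) if VM \(i\) is placed on host \(j\), \(y_j=1\) if host \(j\) remains active, \(r_i\) is the resource demand of VM \(i\), and \(C_j\) is the capacity of host \(j\). Network-aware formulations such as [85] additionally account for the traffic between VM pairs.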

4.3.4 VMP Based on Meta-heuristics

Being NP-hard in nature, the problem of VM placement can be modeled as an optimization problem, and various meta-heuristic algorithms have been explored in the literature to solve it. Particle Swarm Optimization (PSO) has been used by several authors to model the VM placement problem. Ramezani et al. [88] solved the VMP problem using PSO along with fuzzy logic. The objective is to minimize power consumption while improving resource utilization and transfer time; resource utilization has been improved by minimizing both the idle memory and the idle CPU. The fuzzy logic has been used for controlling the inertia weight in the basic PSO, and with the help of the presented scheme the network traffic has been greatly reduced. While trying to minimize energy consumption, Li et al. [20] also mapped the virtual machine placement optimization problem to particle swarm optimization. To avoid getting trapped in local optima, the authors have incorporated probability estimates based upon the Bayesian formula to find the global best and local best positions. Abdessamia et al. [89] have evaluated three inertia factors based upon the local and global best positions; these calculated values help in finding the optimal solution and correspond to the probabilities of selecting the host with maximum energy consumption. In 2018, Tripathi et al. [90] used binary PSO to optimize VM placement by taking energy usage and resource utilization into consideration, with the objective of lessening energy consumption while maximizing resource utilization. Trying to minimize the number of hosts in active mode, Yan et al. [91] have concentrated on memory and CPU utilization as parameters of the objective function in discrete PSO; OpenStack has been used to evaluate the capabilities of the proposed algorithm, which has been compared with the native VM scheduler of OpenStack. Another PSO-based placement technique has been proposed by Kirana and Mello [92] for consolidating virtual machines and improving energy efficiency. Several ant colony-based virtual machine placement algorithms have also been proposed by different authors. In 2013, Gao et al. [93] proposed a technique to find a non-dominated set of solutions using the Pareto dominance concept, aimed at reducing power consumption and resource wastage; to fully utilize a host, the resource wastage should be minimal. Motivated by [93], Malekloo and Kara [94] have proposed an algorithm combining heuristic information and probability-based decision rules with ant colony optimization. The energy consumed by the network elements has been computed, and the K-shortest path algorithm has been used to find the number of network elements between two VMs. Shabeera et al. [95] have optimized the virtual machine allocation by calculating the distance between two physical machines, measured in terms of the network devices used to connect them; the objective is to minimize the delay between these devices. To deal with virtual machines having different arrival times and variable requirements at different time slots, Liu et al. [96] have introduced an energy-aware scheme with multi-objective optimization. A research project by Ashraf and Porres [97] employs an ant colony system. Carrying forward the work done by [96], Xiao et al. [23] have also tried to optimize energy and the number of migrations using an ant colony system.
However, the authors have put forward a different energy model that considers the energy consumed by the host not only in active mode but also in sleep mode as well as while switching between these modes. Alharbi et al. [98] have presented an ACO-based VM placement technique for reducing the amount of energy consumed. The proposed scheme is profile-based and thus uses information from the profiles of virtual machines and host machines; the host profile includes information such as CPU and memory utilization, total and remaining capacities, and the energy consumed at peak and idle times. In recent years, the dragonfly algorithm has been used for optimization in VM consolidation. In 2018, More and Ingle [99] proposed a multi-objective dragonfly algorithm for the placement of VMs during the consolidation process, in which the optimal solution is found by incorporating crow search into the dragonfly algorithm. Keeping better resource utilization in view, Tripathi et al. [100] have tried to reduce resource wastage during VM deployment using dragonfly optimization, the objective being the minimization of CPU and memory wastage. To maintain a balance between exploration and exploitation, a time-dependent transfer function has been incorporated into the binary dragonfly algorithm. Goyal et al. [101] have presented a whale optimization-based resource allocation scheme for making the cloud environment more energy efficient. Researchers [102,103,104,105,106] have also employed Genetic Algorithms to solve VM placement. For the placement of VMs in geographically distributed data centers, Teyeb et al. [107] have put forward a traffic-aware Biogeography-Based Optimization scheme, the main idea being to reduce the number of migrations. Due to the addition and removal of virtual machines, an imbalance of load arises in the host machines. To deal with this imbalance, Li et al. [108] have proposed a multi-objective rebalancing solution aimed at reducing the intra-host and inter-host load imbalance along each dimension, where a dimension corresponds to a particular resource of the host; migration cost has been effectively reduced by the presented scheme. Medara et al. [109] have introduced water wave optimization for VM consolidation. Meta-heuristics-based VM placement techniques have been summarized in Table 11; an illustrative sketch of a swarm-based placement encoding is given after the table.

Table 11 Meta-heuristics based VM placement techniques
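To make the encoding used by such swarm-based placement schemes concrete, the following minimal Python sketch (not the exact method of any cited work) represents a placement as a vector mapping each VM to a host index and scores it with an energy-plus-wastage fitness function. The host capacities, VM demands, power figures, and weighting factor are illustrative assumptions, and the discrete "velocity" step is reduced to a probabilistic move toward the global-best assignment in the spirit of binary and discrete PSO variants.

```python
# Minimal sketch: discrete-PSO style encoding for VM placement.
# All capacities, demands, power figures and weights are assumed values.
import random

HOST_CPU, HOST_MEM = 100, 128            # per-host capacity (assumed units)
P_IDLE, P_MAX = 160.0, 250.0             # assumed idle/peak host power (W)

vms = [(25, 32), (40, 16), (10, 8), (30, 64), (20, 16)]   # (cpu, mem) demands
n_hosts = 3

def fitness(position):
    """Lower is better: power of active hosts plus CPU/memory wastage."""
    cpu, mem = [0] * n_hosts, [0] * n_hosts
    for vm, host in enumerate(position):
        cpu[host] += vms[vm][0]
        mem[host] += vms[vm][1]
    power, waste = 0.0, 0.0
    for h in range(n_hosts):
        if cpu[h] == 0 and mem[h] == 0:
            continue                                   # empty host can be switched off
        if cpu[h] > HOST_CPU or mem[h] > HOST_MEM:
            return float("inf")                        # infeasible placement
        u = cpu[h] / HOST_CPU
        power += P_IDLE + (P_MAX - P_IDLE) * u         # linear power model
        waste += abs((HOST_CPU - cpu[h]) / HOST_CPU -  # imbalance between leftover
                     (HOST_MEM - mem[h]) / HOST_MEM)   # CPU and leftover memory
    return power + 10.0 * waste                        # assumed weighting factor

def perturb(position, gbest):
    """Discrete 'velocity' step: pull each VM toward the global-best host
    with some probability, otherwise keep or randomize its assignment."""
    new = list(position)
    for vm in range(len(new)):
        r = random.random()
        if r < 0.5:
            new[vm] = gbest[vm]
        elif r < 0.7:
            new[vm] = random.randrange(n_hosts)
    return new

# Tiny swarm loop
swarm = [[random.randrange(n_hosts) for _ in vms] for _ in range(20)]
gbest = min(swarm, key=fitness)
for _ in range(100):
    swarm = [perturb(p, gbest) for p in swarm]
    gbest = min(swarm + [gbest], key=fitness)
print("placement:", gbest, "fitness:", round(fitness(gbest), 1))
```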

5 Consolidation Based Solutions for Energy Efficiency in CC Environment

This section summarizes the different VM consolidation techniques discussed in the previous sections. The proposed schemes have been classified on the basis of different parameters, and a taxonomy of solutions to the consolidation process has thereafter been developed.

5.1 Analysis of Different Consolidation Schemes

Tables 12 and 13 have been formulated after analyzing different consolidation schemes proposed by various researchers over the past few years. They provide the summary of various dynamic virtual machine consolidation schemes studied in Sect. 3. The classification has been done based on the deployed architecture, prediction method used, the step of consolidation involved, the resources used and the objectives achieved.

Table 12 Analysis based on architecture, VM consolidation step, and resources
Table 13 Analysis based on prediction and objectives

6 Discussion

The conducted survey has allowed us to identify the different aspects of the seminal work put forward by various researchers in the field of virtual machine consolidation in the cloud computing environment. The state of the art has been classified on the basis of distinct parameters such as the underlying architecture, the resources utilized, the consolidation step involved, and the objectives achieved. Based on this survey of the existing literature, a classification of the solutions to the consolidation process has been developed and is presented in Fig. 7. These solutions can be broadly divided into optimization methods and statistical methods. Optimization methods find solutions that maximize or minimize certain defined objectives, whereas statistical methods work on raw data and analyse it with the help of mathematical formulae or techniques.

Fig. 7 Taxonomy for consolidation solutions

6.1 Optimization Methods

To solve the problem of Dynamic Virtual Machine Consolidation, various optimization techniques have been employed. These techniques can be either heuristic-based or meta-heuristic-based. Heuristic techniques are problem-dependent and are usually based on local search; they may not guarantee an optimal solution but achieve an approximate one, and they can be applied to a large number of problems. Heuristic techniques can be further categorized as deterministic or probabilistic. Given a set of input values, deterministic methods always produce the same outcome, whereas probabilistic methods involve a degree of randomness in the process of finding a solution.

The probabilistic methods include game theory models, bin packing techniques, and Markov chains. Next fit, best fit, worst fit, and best fit decreasing are examples of bin packing techniques. Meta-heuristic techniques, in contrast, help in finding near-optimal solutions and are not problem-dependent. Meta-heuristic methods fall into evolutionary and swarm-based categories. Evolutionary methods are inspired by natural mechanisms such as reproduction and crossover; the genetic algorithm is one such example. Swarm intelligence-based algorithms are inspired by the behavior of animals or birds in nature and are motivated by the interactions and communications occurring between them. Algorithms such as Ant Colony Optimization, Particle Swarm Optimization, Cuckoo Search, and the Dragonfly algorithm fall under this category.
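As an illustration of the bin-packing heuristics mentioned above, the sketch below implements a single-dimension (CPU-only) best fit decreasing placement; the capacities and demands are assumed values chosen only for demonstration.

```python
# Illustrative Best Fit Decreasing (BFD) sketch for CPU-only VM placement.
def best_fit_decreasing(vm_demands, host_capacity):
    """Place VMs (largest first) on the host whose remaining capacity leaves
    the smallest gap; open a new host only when nothing fits."""
    hosts = []                                  # remaining capacity per open host
    placement = {}
    for vm, demand in sorted(enumerate(vm_demands), key=lambda x: -x[1]):
        candidates = [(rem - demand, h) for h, rem in enumerate(hosts) if rem >= demand]
        if candidates:
            _, h = min(candidates)              # tightest feasible fit
        else:
            hosts.append(host_capacity)         # switch on a new host
            h = len(hosts) - 1
        hosts[h] -= demand
        placement[vm] = h
    return placement, len(hosts)

placement, active_hosts = best_fit_decreasing([25, 40, 10, 30, 20], host_capacity=64)
print(placement, active_hosts)
```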

6.2 Statistical Methods

Statistical methods of consolidation involve approximation techniques, machine learning techniques, and exact methods. The approximation methods include those based on fuzzy logic and queuing theory; earliest finish time, earliest deadline first, first in first out, and last in first out are examples of queuing model-based methods. Machine learning techniques help in analysing activity patterns to reduce human intervention [110]. Regression, k-means, Markov chains, neural networks, Bayesian networks, reinforcement learning [9, 9], and model predictive control methods come under the category of machine learning techniques in the context of virtual machine consolidation. Exact methods are used to find an exact solution to a small-sized problem; linear programming, integer linear programming, and constraint satisfaction programming belong to this category.
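As a simple illustration of how a learning-based method can support consolidation decisions, the sketch below fits an ordinary least-squares line over a sliding window of CPU utilization samples and flags a host as overloaded when the extrapolated next value exceeds a threshold; the window length, threshold, and sample trace are assumptions and are not taken from any cited scheme.

```python
# Hedged sketch: linear-regression predictor over recent CPU utilization samples,
# used to flag a host as overloaded ahead of time (all values are illustrative).
def predict_next_utilization(history):
    """Ordinary least-squares line over the last k samples, extrapolated one step."""
    k = len(history)
    xs = range(k)
    mean_x = sum(xs) / k
    mean_y = sum(history) / k
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var if var else 0.0
    return mean_y + slope * (k - mean_x)        # prediction for the next interval

def host_overloaded(history, threshold=0.8):
    return predict_next_utilization(history) > threshold

print(host_overloaded([0.55, 0.62, 0.70, 0.74, 0.79]))   # rising trend -> True
```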

7 Research Gaps

Various gaps in the existing literature, identified after analyzing the seminal work of various researchers in the relevant field, have been highlighted in Fig. 8.

  1. With the help of server consolidation, multiple virtual machines are stacked onto a single machine. However, this raises the concern of a single point of failure: if a server goes down, all the virtual machines housed on it may be lost.

  2. Certain applications, such as audio and video conferencing, are sensitive to delays. Resource allocation to such applications requires careful handling.

  3. In the cloud environment, cloud providers receive simultaneous requests for resources, leading to a race for resources among these requests. Over the last decade, researchers have focussed a great deal on efficient load balancing based on current resource utilization; however, predicted future loads and past utilization have received far less analysis. How to balance VM loads considering both past and future situations is therefore another research challenge.

  4. Shared resources may cause contention during consolidation, further degrading the performance of the applications and raising the issue of SLA violations. Another important challenge that should be taken into account is the interference among VMs, because migrating some of the correlated VMs may degrade service due to network latency, especially over WAN links. Therefore, interference among VMs should also be taken care of while consolidating the VMs.

  5. Researchers have paid considerable attention to single VM migration, but improving the system's performance demands the migration of multiple virtual machines. Scheduling multiple VM migrations is thus a major challenge in optimizing the performance of the cloud data center.

  6. Current load balancing and load detection techniques are static in nature and may not be suitable for dynamic workloads. The dynamic nature of the workload demands the development of adaptive schemes for load detection.

  7. Maximizing RAM utilization may increase the chances of new user requests being dropped. Also, higher levels of CPU utilization raise the server temperature, which in turn demands cooling and leads to more energy consumption. To guarantee the important goals of energy efficiency and QoS, the mapping of VMs to PMs should aim at maintaining a trade-off between efficient utilization of resources and minimized SLA violations.

  8. Resource management is one of the biggest challenges of cloud computing since the cloud environment offers multiple resource types. Whenever the cloud service provider receives a user request, resources are allocated according to the needs of the user. While the CPU is the most used resource, graphics processing units (GPUs) have also been incorporated into the cloud computing environment for handling graphics-intensive workloads. Studies indicate that integrating both processing units can accelerate the performance of the system.

  9. VMP can be considered an optimization problem within the consolidation process that is inherently discrete in nature. Existing optimization techniques often fall into local optima and thereby fail to reach the globally best solution. Designing new metaheuristic algorithms, or improving existing techniques to overcome this issue, is therefore an important area of research for the Virtual Machine Placement problem.

Fig. 8 Gaps in the existing literature

8 Proposed VM Consolidation Framework

In an attempt to address some of the gaps in the existing literature, we propose a VM consolidation (VMC) framework as shown in Fig. 9. While selecting the overloaded host, emphasis should be laid on considering current as well as past and predicted future utilizations. This has led us to focus on scaled estimation techniques for host overload detection; a scale estimator helps in forecasting future values from the current and past observations. For VM selection, the biweight midcorrelation shows advantages over the standard Pearson correlation in terms of robustness and resistance to outliers, which helps in dealing with one of the major challenges of VM selection in the VM consolidation process.
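The sketch below shows how the biweight midcorrelation can be computed over the utilization traces of two co-located VMs; down-weighting samples far from the median is what gives the measure its robustness relative to the Pearson correlation. The traces and the conventional tuning constant of 9 are illustrative assumptions.

```python
# Sketch of the biweight midcorrelation as a robust alternative to Pearson
# correlation when identifying correlated VMs; the traces below are assumed.
import math
import statistics

def _biweight_terms(x):
    med = statistics.median(x)
    mad = statistics.median(abs(v - med) for v in x) or 1e-12
    terms = []
    for v in x:
        u = (v - med) / (9 * mad)                     # conventional tuning constant
        w = (1 - u * u) ** 2 if abs(u) < 1 else 0.0   # outliers get zero weight
        terms.append(w * (v - med))
    return terms

def bicor(x, y):
    a, b = _biweight_terms(x), _biweight_terms(y)
    num = sum(ai * bi for ai, bi in zip(a, b))
    den = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return num / den if den else 0.0

# CPU-utilization traces of two co-located VMs (illustrative values)
print(round(bicor([0.2, 0.3, 0.4, 0.9, 0.5], [0.25, 0.33, 0.42, 0.10, 0.52]), 3))
```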

Fig. 9 Proposed VMC framework

Being NP-hard in nature, the problem of VM placement can best be resolved with the help of meta-heuristic techniques. Since the dragonfly algorithm has proved superior in performance to the best-known meta-heuristic techniques [112], we propose the use of an adaptive binary dragonfly method for solving the VM placement problem.
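As a rough illustration of how a continuous swarm method such as the dragonfly algorithm can be adapted to the discrete placement space, the sketch below applies a V-shaped transfer function that converts a step value into a bit-flip probability, with a time-varying scaling assumed here to shift the balance from exploration to exploitation. This is a sketch of the general idea only, not the exact adaptive scheme to be implemented.

```python
# Hedged sketch of a binary-update step for a discrete swarm algorithm:
# a V-shaped transfer function maps a continuous step to a bit-flip probability.
# The time-varying scaling schedule is an assumption for illustration.
import math
import random

def v_transfer(step, t, max_iter):
    """V-shaped transfer function; steeper in later iterations so that flips
    become more decisive as the run progresses (assumed adaptive schedule)."""
    scale = 1.0 + t / max_iter                 # grows from 1 to 2 over the run
    return abs(math.tanh(scale * step))

def update_bit(bit, step, t, max_iter):
    """Flip the assignment bit with probability given by the transfer function."""
    return 1 - bit if random.random() < v_transfer(step, t, max_iter) else bit

print(update_bit(1, step=0.6, t=10, max_iter=100))
```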

9 Conclusion

The advances in virtualization technology have proved to be a boon for revolutionizing IT and academia. The revolution in the IT industry over the past decade has increased the demand for cloud computing resources and has resulted in a rapid escalation in the number of users requesting resources in Cloud Data Centers. This creates a situation where tasks compete for resources and the outcome is huge energy consumption. Such a condition demands appropriate resource management mechanisms and energy-efficient schemes. Within a data center, the dynamic consolidation process is considered one of the most efficient methods for reducing energy consumption. Virtualization and consolidation of VMs have been widely accepted by researchers for improving resource utilization in CDCs. Various static and dynamic approaches used by different researchers for the Virtual Machine Consolidation process have been discussed in this paper. In the current research, a detailed analysis of consolidation time selection, VM selection, and VM deployment has been carried out. Observing the gaps in the literature, a framework for VM Consolidation has also been proposed, whose implementation will be carried out in future work.