Abstract
Recently, Hadoop has been used extensively to process large amounts of data. However, it still faces resource allocation and load imbalance issues in heterogeneous environments. The objective of this work is to present an efficient resource allocation approach, based on multi-criteria decision making, that assigns the resources required by a given job in a heterogeneous Yarn cluster. The proposed model treats node and job heterogeneity as constraints in order to achieve the best resource allocation while maintaining multiple performance criteria (CPU, disk, network and memory) in real time. It is applied to the Yarn architecture using a modified analytic hierarchy process (AHP). This work aims to mitigate load imbalance and improve resource utilization when jobs and machines have heterogeneous characteristics. The implemented model provided better cluster resource utilization and reduced job completion time relative to the comparable Hadoop schedulers FIFO, Fair and TMSA by 38.3%, 19.4% and 15%, respectively.
Notes
- 1.
- 2. RI is the average CI of 500 randomly filled matrices, as defined by Saaty.
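To illustrate how RI enters the AHP consistency check, the following is a minimal sketch of the standard (unmodified) AHP procedure, not the modified variant proposed in the paper. It assumes the usual definitions CI = (lambda_max - n) / (n - 1) and CR = CI / RI, approximates the priority vector with the row geometric mean, and uses the commonly tabulated RI values. The pairwise comparison matrix over the four criteria named in the abstract is purely hypothetical, as are the function and variable names.

```python
import math

# Commonly tabulated random index (RI) values by matrix order (assumed here).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix):
    """Approximate the priority vector with the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    # Estimate lambda_max as the mean of (A w)_i / w_i over all rows.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    ri = RANDOM_INDEX[n]
    return ci / ri if ri > 0 else 0.0


# Hypothetical pairwise judgments; rows and columns follow CRITERIA order.
CRITERIA = ["CPU", "Memory", "Disk", "Network"]
PAIRWISE = [
    [1.0, 2.0, 4.0, 4.0],
    [1 / 2, 1.0, 2.0, 2.0],
    [1 / 4, 1 / 2, 1.0, 1.0],
    [1 / 4, 1 / 2, 1.0, 1.0],
]

if __name__ == "__main__":
    w = ahp_weights(PAIRWISE)
    for name, weight in zip(CRITERIA, w):
        print(f"{name}: {weight:.3f}")
    print(f"CR = {consistency_ratio(PAIRWISE, w):.3f}")
```

A CR below 0.1 is the conventional threshold for accepting a set of pairwise judgments. The example matrix is perfectly consistent, so its CR is effectively zero.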
References
Awaysheh, F., Alazab, M., Garg, S., Niyato, D., Verikoukis, C.: Big data resource management & networks: taxonomy, survey, and future directions. IEEE Commun. Surv. Tutor. (2021)
Postoaca, A., Pop, F., Prodan, R.: h-Fair: asymptotic scheduling of heavy workloads in heterogeneous data centers. In: 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 366–369 (2018)
Shu-Jun, P., Xi-Min, Z., Da-Ming, H., Shu-Hui, L., Yuan-Xu, Z.: Optimization and research of Hadoop platform based on FIFO scheduler. In: 7th International Conference on Measuring Technology and Mechatronics Automation, pp. 727–730 (2015)
Sharma, G., Ganpati, A.: Performance evaluation of fair and capacity scheduling in Hadoop Yarn. In: 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), pp. 904–906 (2015)
Saaty, T.: Decision Making for Leaders: The Analytic Hierarchy Process for Decisions in a Complex World. RWS Publications, Pittsburgh (1990)
Wang, M., Wu, C., Cao, H., Liu, Y., Wang, Y., Hou, A.: On MapReduce scheduling in Hadoop Yarn on heterogeneous clusters. In: 2018 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference on Big Data Science and Engineering, pp. 1747–1754 (2018)
Bawankule, K., Dewang, R., Singh, A.: Historical data based approach for straggler avoidance in a heterogeneous Hadoop cluster. J. Amb. Intell. Hum. Comput. 12, 9573–9589 (2021)
Paik, S., Goswami, R., Roy, D., Reddy, K.: Intelligent data placement in heterogeneous Hadoop cluster. In: International Conference on Next Generation Computing Technologies, pp. 568–579 (2017)
Naik, N., Negi, A., Br, T., Anitha, R.: A data locality based scheduler to enhance MapReduce performance in heterogeneous environments. Futur. Gener. Comput. Syst. 90, 423–434 (2019)
Thu, M., Nwe, K., Aye, K.: Replication based on data locality for Hadoop distributed file system. In: 9th International Workshop on Computer Science (2019)
Delgado, P., Didona, D., Dinu, F., Zwaenepoel, W.: Kairos: preemptive data center scheduling without runtime estimates. In: Proceedings of the ACM Symposium on Cloud Computing, pp. 135–148 (2018)
Pandey, V., Saini, P.: How heterogeneity affects the design of Hadoop MapReduce schedulers: a state-of-the-art survey and challenges. Big Data, 72–95 (2018)
Javanmardi, A., Yaghoubyan, S., BagheriFard, K., Parvin, H.: An architecture for scheduling with the capability of minimum share to heterogeneous Hadoop systems. J. Supercomput. 77(6), 5289–5318 (2021)
Xu, H., Lau, W.: Optimal job scheduling with resource packing for heterogeneous servers. IEEE/ACM Trans. Netw. 29, 1553–1566 (2021)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hosni, E., Kolsi, N., Chaari, W., Ghedira, K. (2022). Resource Allocation Strategy on Yarn Using Modified AHP Multi-criteria Method for Various Jobs Performed on a Heterogeneous Hadoop Cluster. In: Bădică, C., Treur, J., Benslimane, D., Hnatkowska, B., Krótkiewicz, M. (eds) Advances in Computational Collective Intelligence. ICCCI 2022. Communications in Computer and Information Science, vol 1653. Springer, Cham. https://doi.org/10.1007/978-3-031-16210-7_49
DOI: https://doi.org/10.1007/978-3-031-16210-7_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16209-1
Online ISBN: 978-3-031-16210-7
eBook Packages: Computer Science, Computer Science (R0)