Abstract
Resource scheduling plays a crucial role in improving the resource utilization of server clusters. Five types of scheduling architecture are widely used for scheduling resources in server clusters: statically partitioned schedulers, monolithic schedulers, two-level scheduling, shared-state scheduling, and distributed schedulers. This paper discusses these scheduling architectures and illustrates their key techniques, including resource representation and sharing models, scheduling algorithms, and some other techniques. Different scheduling techniques are applied in different scheduling architectures. The review shows that a large body of work on scheduling strategies for large-scale clusters has been conducted. However, increasingly complicated applications and growing cluster sizes place new requirements on scheduling techniques, so some scheduling techniques can still be improved.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
He, L., Qiang, Z., Zhou, W., Yao, S. (2017). A Review of Resource Scheduling in Large-Scale Server Cluster. In: Uden, L., Lu, W., Ting, IH. (eds) Knowledge Management in Organizations. KMO 2017. Communications in Computer and Information Science, vol 731. Springer, Cham. https://doi.org/10.1007/978-3-319-62698-7_41
DOI: https://doi.org/10.1007/978-3-319-62698-7_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62697-0
Online ISBN: 978-3-319-62698-7
eBook Packages: Computer Science, Computer Science (R0)