Abstract
The performance of supercomputer schedulers is greatly affected by the characteristics of the workload they serve. A good understanding of workload characteristics is essential for developing and evaluating scheduling strategies for an HPC system. In this paper, we present a comprehensive analysis of the workload characteristics of Kraken, the world’s fastest academic supercomputer and 11th on the latest Top500 list, with 112,896 compute cores and a peak performance of 1.17 petaflops. The study uses twelve months of workload traces gathered on the system, covering around 700 thousand jobs submitted by more than one thousand users from 25 research areas. We investigate three categories of workload characteristics: 1) general characteristics, including the distribution of jobs over research fields and queues, the distribution of job sizes for individual users, job cancellation rate, job termination rate, and walltime request accuracy; 2) temporal characteristics, including monthly machine utilization, the temporal distribution of jobs over different time periods, and job inter-arrival times, both between temporally adjacent jobs and between jobs submitted by the same user; 3) execution characteristics, including the distribution of each job attribute, such as queuing time, actual runtime, job size, and memory usage, and the correlations between these attributes. This work provides a realistic basis for scheduler design and comparison by studying the supercomputer’s workload with new approaches, such as a Gaussian mixture model, and from new viewpoints, such as that of the user community. To the best of our knowledge, this is the first study to systematically investigate the workload characteristics of a petascale supercomputer dedicated to open scientific research.
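As a rough illustration of two of the metrics named above, the following sketch computes walltime request accuracy and job inter-arrival times from a small job trace. It is not taken from the paper: the record layout, the sample values, and the helper names `walltime_accuracy` and `inter_arrival_times` are hypothetical, and accuracy is assumed here to mean the ratio of actual runtime to requested walltime.

```python
from collections import defaultdict

# Hypothetical trace records (not Kraken data):
# (user, submit_time_s, requested_walltime_s, actual_runtime_s)
jobs = [
    ("alice", 0, 3600, 1800),
    ("bob", 120, 7200, 7200),
    ("alice", 300, 3600, 900),
    ("bob", 1020, 600, 300),
]


def walltime_accuracy(requested, actual):
    """Assumed definition: actual runtime divided by requested walltime."""
    return actual / requested


def inter_arrival_times(submit_times):
    """Gaps between consecutive submissions, after sorting by submit time."""
    ordered = sorted(submit_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Per-job accuracy of the walltime request.
accuracies = [walltime_accuracy(req, act) for _, _, req, act in jobs]

# Inter-arrival times between temporally adjacent jobs on the whole system.
system_gaps = inter_arrival_times([submit for _, submit, _, _ in jobs])

# Inter-arrival times between jobs submitted by the same user.
submits_by_user = defaultdict(list)
for user, submit, _, _ in jobs:
    submits_by_user[user].append(submit)
user_gaps = {u: inter_arrival_times(t) for u, t in submits_by_user.items()}
```

On a real trace, the two kinds of gaps typically differ a great deal, which is one reason to examine system-wide and per-user inter-arrival times separately.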
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
You, H., Zhang, H. (2013). Comprehensive Workload Analysis and Modeling of a Petascale Supercomputer. In: Cirne, W., Desai, N., Frachtenberg, E., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2012. Lecture Notes in Computer Science, vol 7698. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35867-8_14
DOI: https://doi.org/10.1007/978-3-642-35867-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35866-1
Online ISBN: 978-3-642-35867-8
eBook Packages: Computer Science (R0)