DOI: 10.1145/2832087.2832091
research-article

Techniques for modeling large-scale HPC I/O workloads

Published: 15 November 2015

Abstract

Accurate analysis of HPC storage system designs is contingent on the use of I/O workloads that are truly representative of expected use. However, I/O analyses are generally bound to specific workload modeling techniques such as synthetic benchmarks or trace replay mechanisms, despite the fact that no single workload modeling technique is appropriate for all use cases. In this work, we present the design of IOWA, a novel I/O workload abstraction that allows arbitrary workload consumer components to obtain I/O workloads from a range of diverse input sources. Thus, researchers can choose specific I/O workload generators based on the resources they have available and the type of evaluation they wish to perform. As part of this research, we also outline the design of three distinct workload generation methods, based on I/O traces, synthetic I/O kernels, and I/O characterizations. We analyze and contrast each of these workload generation techniques in the context of storage system simulation models as well as production storage system measurements. We found that each generator mechanism offers varying levels of accuracy, flexibility, and breadth of use that should be considered before performing I/O analyses. We also recommend a set of best practices for HPC I/O workload modeling based on challenges that we encountered while performing our evaluation.
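The separation the abstract describes, in which workload consumers are decoupled from workload sources, can be illustrated with a minimal sketch. This is not IOWA's actual interface, which the abstract does not specify. Every name here (`IOOp`, `WorkloadGenerator`, the three generator classes, and `total_bytes_written`) is hypothetical, and the operation record is deliberately simplified. The sketch shows only that a consumer written against one interface can be driven by a trace, a synthetic kernel, or an aggregate characterization.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol


@dataclass(frozen=True)
class IOOp:
    """One abstract I/O operation (hypothetical, simplified record)."""
    kind: str        # "open", "read", "write", or "close"
    file_id: int
    offset: int = 0
    size: int = 0


class WorkloadGenerator(Protocol):
    """Any source that can produce a stream of operations for one rank."""
    def ops(self, rank: int) -> Iterator[IOOp]: ...


class TraceGenerator:
    """Replays previously recorded operations, held in memory here."""
    def __init__(self, records: Dict[int, List[IOOp]]):
        self.records = records

    def ops(self, rank: int) -> Iterator[IOOp]:
        yield from self.records.get(rank, [])


class SyntheticKernelGenerator:
    """Emits a fixed pattern: interleaved block writes to one shared file."""
    def __init__(self, count: int, block: int, nranks: int):
        self.count, self.block, self.nranks = count, block, nranks

    def ops(self, rank: int) -> Iterator[IOOp]:
        yield IOOp("open", 0)
        for i in range(self.count):
            offset = (i * self.nranks + rank) * self.block
            yield IOOp("write", 0, offset, self.block)
        yield IOOp("close", 0)


class CharacterizationGenerator:
    """Rebuilds a plausible stream from per-rank aggregate counters only.

    Ordering and timing are not recoverable from counters, so this assumes
    equal-sized sequential writes to one file per rank. Integer division
    drops any remainder bytes.
    """
    def __init__(self, bytes_per_rank: int, writes_per_rank: int):
        self.bytes_per_rank = bytes_per_rank
        self.writes_per_rank = writes_per_rank

    def ops(self, rank: int) -> Iterator[IOOp]:
        size = self.bytes_per_rank // self.writes_per_rank
        yield IOOp("open", rank)
        for i in range(self.writes_per_rank):
            yield IOOp("write", rank, i * size, size)
        yield IOOp("close", rank)


def total_bytes_written(gen: WorkloadGenerator, nranks: int) -> int:
    """A consumer: it depends only on the interface, not on the source."""
    return sum(op.size
               for rank in range(nranks)
               for op in gen.ops(rank)
               if op.kind == "write")
```

A simulator or replay tool would take the place of `total_bytes_written`. Swapping the generator passed to it changes the workload source without touching the consumer, which is the trade-off between accuracy, flexibility, and available input data that the abstract attributes to the three generator types.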



Published In

PMBS '15: Proceedings of the 6th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computing Systems
November 2015
105 pages
ISBN:9781450340090
DOI:10.1145/2832087
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

SC15
Acceptance Rates

PMBS '15 Paper Acceptance Rate 9 of 22 submissions, 41%;
Overall Acceptance Rate 9 of 22 submissions, 41%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 54
  • Downloads (last 6 weeks): 3

Reflects downloads up to 04 Jan 2025

Cited By

  • (2024) Detecting interference between applications and improving the scheduling using malleable application clones. International Journal of High Performance Computing Applications 38:2 (108-133). DOI: 10.1177/10943420231220898. Online publication date: 10-Apr-2024
  • (2024) Tarazu: An Adaptive End-to-end I/O Load-balancing Framework for Large-scale Parallel File Systems. ACM Transactions on Storage 20:2 (1-42). DOI: 10.1145/3641885. Online publication date: 1-Feb-2024
  • (2024) ExDe. Future Generation Computer Systems 153:C (84-96). DOI: 10.1016/j.future.2023.11.013. Online publication date: 16-May-2024
  • (2023) I/O Access Patterns in HPC Applications: A 360-Degree Survey. ACM Computing Surveys 56:2 (1-41). DOI: 10.1145/3611007. Online publication date: 15-Sep-2023
  • (2022) Machine Learning Assisted HPC Workload Trace Generation for Leadership Scale Storage Systems. Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing (199-212). DOI: 10.1145/3502181.3531457. Online publication date: 27-Jun-2022
  • (2022) Extracting and characterizing I/O behavior of HPC workloads. 2022 IEEE International Conference on Cluster Computing (CLUSTER) (243-255). DOI: 10.1109/CLUSTER51413.2022.00037. Online publication date: Sep-2022
  • (2022) Design and implementation of dynamic I/O control scheme for large scale distributed file systems. Cluster Computing 25:6 (4423-4438). DOI: 10.1007/s10586-022-03640-0. Online publication date: 30-Jul-2022
  • (2021) Parallel I/O Evaluation Techniques and Emerging HPC Workloads: A Perspective. 2021 IEEE International Conference on Cluster Computing (CLUSTER) (671-679). DOI: 10.1109/Cluster48925.2021.00100. Online publication date: Sep-2021
  • (2020) Mapping and scheduling HPC applications for optimizing I/O. Proceedings of the 34th ACM International Conference on Supercomputing (1-12). DOI: 10.1145/3392717.3392764. Online publication date: 29-Jun-2020
  • (2020) Emulating I/O Behavior in Scientific Workflows on High Performance Computing Systems. 2020 IEEE/ACM Fifth International Parallel Data Systems Workshop (PDSW) (34-39). DOI: 10.1109/PDSW51947.2020.00011. Online publication date: Nov-2020
