
The impact of system design parameters on application noise sensitivity

Published: 01 March 2013

Abstract

Operating system (OS) noise, or jitter, is a key limiter of application scalability in high-end computing systems. Several studies have attempted to quantify the sources and effects of system interference, but few show how architectural and system characteristics influence the impact of noise at scale. In this paper, we examine three such system properties: platform balance, noisy node distribution, and the choice of collective algorithm. Using a previously developed noise injection tool, we explore how the impact of noise varies with these platform characteristics. We provide detailed performance results indicating that a system with relatively less network bandwidth can absorb more noise than a system with more network bandwidth. Our results also show that only a subset of noisy nodes can significantly degrade application performance. The placement of the noisy nodes also matters, especially for applications that make substantial use of tree-based collective communication operations. Lastly, performance results indicate that non-blocking collective operations can greatly mitigate the impact of OS interference. Taken together, these results show that the impact of OS noise is not solely a property of application communication behavior, but is also influenced by other properties of the system architecture and system software environment.
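
The claim that a subset of noisy nodes can degrade the whole application can be illustrated with a toy model. The sketch below is not the noise injection tool used in the paper. It assumes a bulk-synchronous application in which every iteration ends with a blocking collective, so each iteration lasts as long as the slowest node. The function names (`simulate`, `slowdown`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate(num_nodes, noisy_nodes, iterations=1000, work=1.0,
             noise_duration=0.25, noise_prob=0.01, seed=0):
    """Total runtime of a toy bulk-synchronous application (assumed model).

    Each iteration performs `work` time units of computation per node and
    then ends with a blocking collective, so the iteration takes as long as
    the slowest node. Only the first `noisy_nodes` nodes can be interrupted,
    each with probability `noise_prob` per iteration.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(iterations):
        # Draw one value per node, noisy or not, so that runs with the same
        # seed see the same random stream and are directly comparable.
        draws = [rng.random() for _ in range(num_nodes)]
        hit = any(d < noise_prob for d in draws[:noisy_nodes])
        # A single interrupted node delays everyone at the collective.
        total += work + (noise_duration if hit else 0.0)
    return total


def slowdown(num_nodes, noisy_nodes, **kwargs):
    """Runtime relative to the same run with no noisy nodes."""
    quiet = simulate(num_nodes, 0, **kwargs)
    return simulate(num_nodes, noisy_nodes, **kwargs) / quiet


if __name__ == "__main__":
    # Compare a quiet machine with increasingly large noisy subsets.
    for k in (0, 64, 256, 1024):
        print(k, round(slowdown(1024, k), 3))
```

Under this model the expected per-iteration penalty with k noisy nodes is noise_duration * (1 - (1 - noise_prob)^k), which saturates quickly as k grows. That is why a modest noisy subset already costs a large share of the slowdown seen when every node is noisy. The model ignores node placement, network bandwidth, and non-blocking collectives, all of which the paper studies on real systems.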

Published In

Cluster Computing, Volume 16, Issue 1
March 2013
196 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Jitter
  2. Non-blocking collectives
  3. Operating systems interference
  4. System balance

Qualifiers

  • Article
