DOI:10.1145/2612262.2612270
research-article

Reduction of operating system jitter caused by page reclaim

Published: 10 June 2014

Abstract

Operating system jitter is one of the major causes of runtime overhead in high performance computing applications. Jitter results from the execution of kernel services, such as interrupt handling and tasklets, and from daemon processes that provide operating system services, such as memory management daemons. This execution interrupts application computations and increases their execution time. Jitter significantly affects applications in which many processes or threads frequently synchronize with each other. In this paper, we investigate the impact of jitter caused by reclaiming memory pages and propose a method for reducing that impact. The target operating system is Linux. When the Linux kernel runs short of memory, it awakens a special kernel thread to reclaim memory pages that are unlikely to be used in the near future. If the kernel thread is awakened frequently, application performance is degraded by the resources the thread consumes. The proposed method reclaims memory pages before the kernel thread does. It reclaims more pages at one time than the kernel thread, thus reducing both the frequency of page reclaim and the impact of jitter. We implement a system based on the proposed method and conduct an experiment using practical weather forecast software. The results show that the proposed method minimizes performance degradation caused by jitter.
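The relationship between reclaim batch size and reclaim frequency described in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: the function name `count_reclaim_wakeups`, the constant allocation rate, the single low watermark, and all numeric values are assumptions chosen purely for illustration.

```python
def count_reclaim_wakeups(total_pages, alloc_per_step, steps,
                          low_watermark, batch_pages):
    """Toy model (hypothetical): count how often a reclaimer has to run.

    Each step the application consumes `alloc_per_step` free pages.
    Whenever the number of free pages drops below `low_watermark`,
    the reclaimer runs once and frees `batch_pages` pages.
    """
    free = total_pages
    wakeups = 0
    for _ in range(steps):
        # The application takes pages (for example, page cache filled by file I/O).
        free = max(free - alloc_per_step, 0)
        if free < low_watermark:
            # One reclaim pass; free memory can never exceed the total.
            wakeups += 1
            free = min(free + batch_pages, total_pages)
    return wakeups


# Same workload, two reclaim batch sizes (all values are illustrative).
small = count_reclaim_wakeups(10_000, 100, 1_000, 1_000, batch_pages=200)
large = count_reclaim_wakeups(10_000, 100, 1_000, 1_000, batch_pages=5_000)
print(f"small batches: {small} reclaim passes, large batches: {large}")
```

In this model, a larger batch means fewer reclaim passes for the same workload, which is the effect the abstract attributes to reclaiming more pages at one time. The model ignores the cost of each pass and the risk of evicting pages that are still needed, both of which matter in a real system.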


Published In

ROSS '14: Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers
June 2014
76 pages
ISBN:9781450329507
DOI:10.1145/2612262
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

  • SPCL: Scalable Parallel Computing Laboratory

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. file I/O
  2. high performance computing
  3. memory management
  4. operating systems
  5. page cache

Qualifiers

  • Research-article

Acceptance Rates

ROSS '14 Paper Acceptance Rate 9 of 16 submissions, 56%;
Overall Acceptance Rate 58 of 169 submissions, 34%

Cited By

  • (2024) Combining buffered I/O and direct I/O in distributed file systems. Proceedings of the 22nd USENIX Conference on File and Storage Technologies, pages 17-34. DOI:10.5555/3650697.3650699. Online publication date: 27-Feb-2024
  • (2021) Online Working Set Change Detection with Constant Complexity. Proceedings of the International Symposium on Memory Systems, pages 1-16. DOI:10.1145/3488423.3519332. Online publication date: 27-Sep-2021
  • (2018) System Software for Data-Intensive Science. Advanced Software Technologies for Post-Peta Scale Computing, pages 99-120. DOI:10.1007/978-981-13-1924-2_6. Online publication date: 7-Dec-2018
  • (2017) Energy-Performance Modeling of Speculative Checkpointing for Exascale Systems. IEICE Transactions on Information and Systems, E100.D:12, pages 2749-2760. DOI:10.1587/transinf.2017PAP0002. Online publication date: 2017
  • (2016) Experimental analysis of operating system jitter caused by page reclaim. The Journal of Supercomputing, 72:5, pages 1946-1972. DOI:10.1007/s11227-016-1703-1. Online publication date: 1-May-2016
