DOI: 10.1145/2000417.2000419
Research article

Contentiousness vs. sensitivity: improving contention aware runtime systems on multicore architectures

Published: 05 June 2011

Abstract

Runtime systems that mitigate memory resource contention on multicore processors have recently attracted considerable research attention. A critical component of these runtimes is the set of indicators used to rank and classify applications by their contention characteristics. However, despite significant research effort, application contention characteristics remain poorly understood, and the indicators have not been thoroughly evaluated.
In this paper we present a thorough study of applications' contention characteristics, with the goal of developing better indicators for contention-aware runtime systems. An application's contention characteristics comprise its contentiousness and its sensitivity to contention. We show that contentiousness and sensitivity are not strongly correlated and that, contrary to prior work, a single indicator is not adequate to predict both. Also, while prior work argues that last-level cache miss rate is one of the best indicators of an application's contention characteristics, we show that, depending on the workload, it can often be misleading. We then present prediction models that consider contention in various memory resources. Our regression analysis establishes an accurate model for predicting application contentiousness. The analysis also demonstrates that performance counters alone may not be sufficient to accurately predict application sensitivity to contention. Our evaluation using the SPEC CPU2006 benchmarks shows that, when predicting an application's contentiousness, the linear correlation coefficient R² between our predictor and the measured contentiousness is 0.834, as opposed to 0.224 when using last-level cache miss rate.
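The distinction between the two characteristics can be illustrated with a minimal sketch. It is not the paper's method: it assumes that sensitivity is measured as the average relative slowdown an application suffers from its co-runners, and contentiousness as the average relative slowdown it inflicts on them. The application names, timings, and helper functions (`slowdown`, `sensitivity`, `contentiousness`, `r_squared`) are all hypothetical.

```python
from statistics import mean

def slowdown(solo_time, corun_time):
    """Relative slowdown: fraction of extra run time when co-running."""
    return (corun_time - solo_time) / solo_time

def r_squared(xs, ys):
    """R^2 of a one-variable least-squares linear fit of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical timings (seconds); not measurements from the paper.
solo = {"A": 100.0, "B": 80.0, "C": 120.0}
# corun[(victim, aggressor)] = run time of victim while sharing with aggressor
corun = {
    ("A", "B"): 110.0, ("A", "C"): 130.0,
    ("B", "A"): 100.0, ("B", "C"): 120.0,
    ("C", "A"): 126.0, ("C", "B"): 123.0,
}

def sensitivity(app):
    """Average slowdown that app suffers across its co-runners."""
    return mean(slowdown(solo[v], t) for (v, _), t in corun.items() if v == app)

def contentiousness(app):
    """Average slowdown that app inflicts on its co-runners."""
    return mean(slowdown(solo[v], t) for (v, a), t in corun.items() if a == app)

apps = sorted(solo)
for app in apps:
    print(app, "sensitivity:", round(sensitivity(app), 4),
          "contentiousness:", round(contentiousness(app), 4))

# How well does one characteristic linearly predict the other?
print("R^2:", r_squared([contentiousness(a) for a in apps],
                        [sensitivity(a) for a in apps]))
```

In this made-up data set, application C inflicts the largest slowdown on others while suffering the smallest itself, which is the kind of case where a single indicator cannot rank applications on both characteristics.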



Published In

EXADAPT '11: Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
June 2011
73 pages
ISBN:9781450307086
DOI:10.1145/2000417
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. contention aware runtimes
  2. contentiousness vs sensitivity
  3. memory subsystems
  4. multicore processors
  5. scheduling

Qualifiers

  • Research-article

Conference

EXADAPT '11
