DOI: 10.1145/2248418.2248432

Improving dynamic prediction accuracy through multi-level phase analysis

Published: 12 June 2012

Abstract

Phase analysis, which groups execution intervals with similar execution behavior and resource requirements, has been widely used in dynamic systems, including dynamic cache reconfiguration, prefetching, and race detection. Although phase granularity is a major factor in the accuracy of phase prediction, it has not been well investigated, and most dynamic systems adopt a fine-grained prediction scheme. Such a scheme, however, can take into account only recent local phase information and is frequently disturbed by temporary noise from momentary phase changes, which can notably limit prediction accuracy.
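As a rough illustration of what grouping execution intervals into phases can look like, the sketch below assigns each interval a phase ID by comparing a normalized per-interval signature against the phases seen so far. The signature contents, the Manhattan distance, the threshold value, and the names `manhattan` and `classify_intervals` are assumptions made for illustration; the abstract does not specify the classifier used in the paper.

```python
def manhattan(a, b):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify_intervals(signatures, threshold=0.5):
    """Assign a phase ID to each interval signature.

    An interval joins the first known phase whose representative lies
    within `threshold` (hypothetical value); otherwise it starts a new phase.
    """
    representatives = []  # one normalized signature per known phase
    phase_ids = []
    for sig in signatures:
        total = sum(sig) or 1  # avoid division by zero for empty intervals
        norm = [x / total for x in sig]
        for pid, rep in enumerate(representatives):
            if manhattan(norm, rep) <= threshold:
                phase_ids.append(pid)
                break
        else:
            representatives.append(norm)
            phase_ids.append(len(representatives) - 1)
    return phase_ids
```

Each signature here could be, for example, a vector of per-code-region execution counts collected over one fine-grained interval; any other per-interval metric vector would work the same way in this sketch.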
In this paper, we present the first investigation of the potential of multi-level phase analysis (MLPA), in which phase analyses at different granularities are combined to improve overall accuracy. The key observation is that a coarse-grained interval, which usually consists of stably distributed fine-grained intervals, can be accurately identified from the fine-grained intervals at the beginning of its execution. Based on this observation, we design and implement an MLPA scheme. In this scheme, a coarse-grained phase is first identified from the fine-grained intervals at the beginning of its execution. The subsequent fine-grained phases within it are then predicted from the sequence of fine-grained phases in that coarse-grained phase. Experimental results show that this scheme notably improves prediction accuracy. With a Markov fine-grained phase predictor as the baseline, MLPA improves prediction accuracy on SPEC2000 by 20%, 39%, and 29% for next-phase, phase-change, and phase-length prediction, respectively, while incurring only about 2% time overhead and 40% space overhead (about 360 bytes in total).
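The two-level idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the fixed number of fine-grained intervals per coarse-grained interval, the use of the first `prefix_len` fine-grained phase IDs as the coarse-phase signature, the first-order Markov fallback, and all class and parameter names are hypothetical.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """First-order Markov baseline: predict the most frequent successor."""

    def __init__(self):
        self.successors = defaultdict(Counter)
        self.last = None

    def predict(self):
        counts = self.successors.get(self.last)
        if not counts:
            return self.last  # no history yet: predict "same phase again"
        return counts.most_common(1)[0][0]

    def update(self, phase):
        if self.last is not None:
            self.successors[self.last][phase] += 1
        self.last = phase


class MultiLevelPredictor:
    """Identify the coarse phase from its first fine phases, then replay it."""

    def __init__(self, fine_per_coarse=4, prefix_len=2):
        assert 0 < prefix_len <= fine_per_coarse
        self.n = fine_per_coarse   # assumed fixed coarse-interval length
        self.k = prefix_len        # assumed signature length
        self.fallback = MarkovPredictor()
        self.recorded = {}         # prefix -> fine-phase sequence seen last time
        self.current = []          # fine phases seen in the current coarse interval

    def predict(self):
        pos = len(self.current)
        if pos >= self.k:
            seq = self.recorded.get(tuple(self.current[:self.k]))
            if seq is not None:
                return seq[pos]    # coarse phase recognized: replay its sequence
        return self.fallback.predict()

    def update(self, phase):
        self.fallback.update(phase)
        self.current.append(phase)
        if len(self.current) == self.n:
            self.recorded[tuple(self.current[:self.k])] = list(self.current)
            self.current = []


def accuracy(predictor, trace):
    """Fraction of fine-grained intervals whose phase was predicted correctly."""
    hits = 0
    for phase in trace:
        if predictor.predict() == phase:
            hits += 1
        predictor.update(phase)
    return hits / len(trace)
```

On a synthetic trace such as `[1, 2, 1, 3] * 10`, where phase 1 is followed by phase 2 and phase 3 equally often, the first-order baseline keeps mispredicting one of the two successors, while the two-level sketch resolves the ambiguity once the coarse-grained phase has been recognized from its prefix.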
To demonstrate the effectiveness of MLPA, we apply it to a dynamic cache reconfiguration system that adjusts the cache size at run time to reduce the power consumption and access time of the data cache. Experimental results show that MLPA reduces the average cache size by a further 15% compared with the fine-grained scheme.
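To make the link between phase prediction and cache reconfiguration concrete, the sketch below picks, for each predicted phase, the smallest cache size whose miss rate stays close to that of the largest size, and then averages the chosen sizes. The selection policy, the tolerance, the miss-rate numbers, and the names `pick_cache_size` and `average_cache_size` are assumptions for illustration only; the abstract does not describe the reconfiguration policy.

```python
def pick_cache_size(miss_rates, tolerance=0.01):
    """Return the smallest size (KB) whose miss rate is within `tolerance`
    of the miss rate at the largest size.

    `miss_rates` maps cache size in KB to a measured miss rate (hypothetical data).
    """
    best = miss_rates[max(miss_rates)]
    for size in sorted(miss_rates):
        if miss_rates[size] <= best + tolerance:
            return size
    return max(miss_rates)  # unreachable in practice: the largest size always qualifies


def average_cache_size(predicted_phases, profile):
    """Average configured cache size over a sequence of predicted phases."""
    sizes = [pick_cache_size(profile[p]) for p in predicted_phases]
    return sum(sizes) / len(sizes)


# Hypothetical per-phase miss-rate profiles, keyed by cache size in KB.
profile = {
    "A": {8: 0.200, 16: 0.048, 32: 0.045, 64: 0.040},
    "B": {8: 0.020, 16: 0.020, 32: 0.020, 64: 0.020},
}
```

Under a policy like this, a mispredicted phase means the cache is configured for the wrong profile, which is one way more accurate phase prediction could translate into a smaller average cache size.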

    Information

    Published In

    LCTES '12: Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
    June 2012
    153 pages
    ISBN: 9781450312127
    DOI: 10.1145/2248418

    ACM SIGPLAN Notices, Volume 47, Issue 5 (LCTES '12)
    May 2012
    152 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2345141

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cache reconfiguration
    2. dynamic prediction
    3. multi-level
    4. phase analysis

    Qualifiers

    • Research-article

    Conference

    LCTES '12

    Acceptance Rates

    Overall Acceptance Rate 116 of 438 submissions, 26%

    Cited By

    • (2024) Beyond Time-Quantum: A Basic-Block FDA Approach for Accurate System Computing Performance Estimation. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pages 698-703. DOI: 10.1109/ASP-DAC58780.2024.10473915. Online publication date: 22-Jan-2024
    • (2019) Hardware Counters’ Space Reduction for Code Region Characterization. Euro-Par 2019: Parallel Processing, pages 74-86. DOI: 10.1007/978-3-030-29400-7_6. Online publication date: 13-Aug-2019
    • (2018) A Loop-Based Methodology for Reducing Computational Redundancy in Workload Sets. IEEE Access, volume 6, pages 9570-9584. DOI: 10.1109/ACCESS.2017.2788921. Online publication date: 2018
    • (2014) DAPs. Proceedings of the 51st Annual Design Automation Conference, pages 1-6. DOI: 10.1145/2593069.2593116. Online publication date: 1-Jun-2014
