[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Dynamic access distance driven cache replacement

Published: 18 October 2011 Publication History

Abstract

In this article, we propose a new cache replacement policy that makes the replacement decision based on the reuse information of the cache lines and the requested data. We present the architectural support and evaluate the performance of our approach using SPEC benchmarks. We also develop two reuse information predictors: a profile-based static predictor and a runtime predictor. The applicability of each predictor is discussed in this paper. We further extend our reuse information predictors so that the cache can adaptively choose between the reuse information based replacement policy and an approximation of LRU policy. According to the experimental results, our adaptive reuse information based replacement policy performs either better than or close to the LRU policy. Our experiments show that L2 cache misses are reduced by 12.32% and 19.95% using the profiling-based static and runtime adaptive predictors respectively.

References

[1]
Alameldeen, A. R. and Wood, D. A. 2004. Adaptive cache compression for high-performance processors. In Proceedings of the Annual International Symposium on Computer Architecture.
[2]
Belady, L. A. 1966. A study of replacement algorithms for a virtual-storage computer. IBM Syst. J. 5, 2, 78--101.
[3]
Beyls, K. and D'Hollander, E. 2002. Reuse distance-based cache hint selection. In Proceedings of the International Euro-Par Conference on Parallel Processing.
[4]
Beyls, K. and D'Hollander, E. 2005. Generating cache hints for improved program efficiency. J. Syst. Archit. 51, 4, 223--250.
[5]
Eickemeyer, R. J. and Vassiliadis, S. 1993. A load instruction unit for pipelined processors. IBM J. Resear. Devel. 547--564.
[6]
Etsion, Y. and Feitelson, D. G. 2007. L1 cache filtering through random selection of memory references. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques.
[7]
Gao, H. and Wilkerson, C. 2010. A dueling segmented LRU replacement algorithm with adaptive bypassing. In Proceedings of the JILP Workshop on Computer Architecture Competitions.
[8]
Gonzalez, A., Aliagas, C., and Valero, M. 1995. A data cache with multiple caching strategies tuned to different types oflocality. In Proceedings of the International Conference on Supercomputing.
[9]
Gu, X., Bai, T., Gao, Y., Zhang, C., Archambault, R., and Ding, C. 2008. P-opt: Program-directed optimal cache management. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing. 217--231.
[10]
Hardavellas, N., Somogyi, S., Wenisch, T. F., Wunderlich, R. E., Chen, S., Kim, J., Falsafi, B., Hoe, J. C., and Nowatzyk, A. G. 2004. Simflex: a fast, accurate, flexible full-system simulation framework for performance evaluation of server architecture. SIGMETRICS Eval. Rev. 31, 4, 31--34.
[11]
HP. 2002. Inside the Intel Itanium 2 processor. Tech. white paper, HP.
[12]
Jaleel, A., Hasenplaugh, W., Qureshi, M. K., Sebot, J., Stelly, S., and Emer, J. 2008. Adaptive insertion policies for matching shared caches. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques.
[13]
Jaleel, A., Theobald, K. B., Steely, JR., S. C., and Emer, J. 2010. High performance cache replacement using re-reference interval prediction (RRIP). In Proceedings of the Annual International Symposium on Computer Architecture. 60--71.
[14]
Johnson, T. L. 1998. Run-time adaptive cache management. PhD thesis, University of Illinois, Urbana, IL.
[15]
Kaxiras, S., Hu, Z., and Martonosi, M. 2001. Cache decay: exploiting generational behavior to reduce cache leakage power. In Proceedings of the Annual International Symposium on Computer Architecture.
[16]
Keramidas, G., Petoumenos, P., and Kaxiras, S. 2007. Cache replacement based on reuse-distance prediction. In Proceedings of the IEEE International Conference on Computer Design.
[17]
Khan, S. M., Tian, Y., and Jimenez, D. A. 2010. Sampling dead block prediction for last-level caches. In Proceedings of the Annual ACM/IEEE International Symposium on Microarchitecture. 175--186.
[18]
Kharbutli, M. and Solihin, Y. 2005. Counter-based cache replacement algorithms. In Proceedings of the IEEE International Conference on Computer Design.
[19]
Lai, A., Fide, C., and Falsafi, B. 2001. Dead-block prediction & dead-block correlating prefetchers. In Proceedings of the Annual International Symposium on Computer Architecture.
[20]
Lin, W. and Reinhardt, S. 2002. Predicting last-touch references under optimal replacement. Tech. rep. CSE-TR-447-02, University of Michigan.
[21]
Lipasti, M. H., Wilkerson, C. B., and Shen, J. P., 1996. Value locality and load value prediction. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems. 138--147.
[22]
Magnusson, P. S., Christensson, M., Eskilson, J., Forsgren, D., Hallberg, G., Hogberg, J., Larsson, F., Moestedt, A., and Werner, B. 2002. Simics: A full system simulation platform. Computer 35, 50--58.
[23]
Mattson, R. L., Gecsei, J., Slutz, D., and Traiger, I. L. 1970. Evaluation techniques for storage hierarchies. IBM Syst. J. 9, 2, 78--117.
[24]
McFarling, S. 1992. Cache replacement with dynamic exclusion. In Proceedings of the Annual International Symposium on Computer Architecture. 191--200.
[25]
Qureshi, M. K, Jaleel, A., Patt, Y. N., Steely, S. C., and Emer, J. 2007. Adaptive insertion policies for high performance caching. In Proceedings of the Annual International Symposium on Computer Architecture.
[26]
Rajan, K and Ramaswamy, G. 2007. Emulating optimal replacement with a shepherd cache. In Proceedings of the Annual ACM/IEEE International Symposium on Microarchitecture. 445--454.
[27]
Schlansker, M. and Rau, B. 2000. Epic: Explicitly parallel instruction computing. Computer 33, 2, 37--45.
[28]
Thoziyoor, S., Ahn, J., Monchiero, M., Brockman, J., and Jouppi, N. 2008. A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies. In Proceedings of the Annual International Symposium on Computer Architecture.
[29]
Tyson, G., Farrens, M., Matthews, J., and Pleszkun, A. R. 1995. A modified approach to data cache management, In Proceedings of the Annual ACM/IEEE International Symposium on Microarchitecture.
[30]
Wang, Z., McKinley, K. S., Rosenberg, A. L., and Weems, C. C. 2002. Using the compiler to improve cache replacement decisions. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques.
[31]
Wong, W. A. and Baer, J.-L. 2000. Modified LRU policies for improving second-level cache behavior. In Proceedings of the International Symposium on High-Performance Computer Architecture.
[32]
Wunderlich, R. E., Wenisch, T. F., Falsafi, B., and Hoe, J. C. 2003. SMARTS: Accelerating microarchitecture simulation via rigorous statistical sampling. In Proceedings of the Annual International Symposium on Computer Architecture.

Cited By

View all

Index Terms

  1. Dynamic access distance driven cache replacement

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Architecture and Code Optimization
    ACM Transactions on Architecture and Code Optimization  Volume 8, Issue 3
    October 2011
    209 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/2019608
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 October 2011
    Accepted: 01 July 2011
    Revised: 01 October 2010
    Received: 01 January 2010
    Published in TACO Volume 8, Issue 3

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Cache replacement policy
    2. L2 Cache
    3. value prediction

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)64
    • Downloads (Last 6 weeks)11
    Reflects downloads up to 01 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2018)Evolutionary design of the memory subsystemApplied Soft Computing10.1016/j.asoc.2017.09.04762(1088-1101)Online publication date: Jan-2018
    • (2016)Multi-objective optimization of energy consumption and execution time in a single level cache memory for embedded systemsJournal of Systems and Software10.1016/j.jss.2015.10.012111:C(200-212)Online publication date: 1-Jan-2016
    • (2015)SLIPACM SIGARCH Computer Architecture News10.1145/2872887.275039843:3S(349-361)Online publication date: 13-Jun-2015
    • (2015)SLIPProceedings of the 42nd Annual International Symposium on Computer Architecture10.1145/2749469.2750398(349-361)Online publication date: 13-Jun-2015
    • (2013)PacmanACM SIGPLAN Notices10.1145/2555670.246648248:11(39-50)Online publication date: 20-Jun-2013
    • (2013)PacmanProceedings of the 2013 international symposium on memory management10.1145/2491894.2466482(39-50)Online publication date: 20-Jun-2013
    • (2013)PacmanProceedings of the 2013 international symposium on memory management10.1145/2464157.2466482(39-50)Online publication date: 20-Jun-2013
    • (2013)Platform-independent analysis of function-level communication in workloads2013 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC.2013.6704685(196-206)Online publication date: Sep-2013
    • (2012)A generalized theory of collaborative cachingACM SIGPLAN Notices10.1145/2426642.225901247:11(109-120)Online publication date: 15-Jun-2012
    • (2012)A generalized theory of collaborative cachingProceedings of the 2012 international symposium on Memory Management10.1145/2258996.2259012(109-120)Online publication date: 15-Jun-2012
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Full Access

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media