
Dead-block prediction & dead-block correlating prefetchers

Published: 01 May 2001

Abstract

Effective data prefetching requires accurate mechanisms to predict both “which” cache blocks to prefetch and “when” to prefetch them. This paper proposes Dead-Block Predictors (DBPs), trace-based predictors that accurately identify “when” an L1 data cache block becomes evictable or “dead”. Predicting a dead block significantly enhances prefetching lookahead and opportunity, and enables placing data directly into L1, obviating the need for auxiliary prefetch buffers. This paper also proposes Dead-Block Correlating Prefetchers (DBCPs), which use address correlation to predict “which” subsequent block to prefetch when a block becomes evictable. A DBCP enables effective data prefetching in a wide spectrum of pointer-intensive, integer, and floating-point applications.
We use cycle-accurate simulation of an out-of-order superscalar processor and memory-intensive benchmarks to show that: (1) dead-block prediction enhances prefetching lookahead by at least an order of magnitude compared to previous techniques, (2) a DBP predicts dead blocks with an average coverage of 90% while mispredicting only 4% of the time, (3) a DBCP offers an address prediction coverage of 86% while mispredicting only 3% of the time, and (4) DBCPs improve performance by 62% on average and 282% at best in the benchmarks we studied.
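
The two predictions described in the abstract ("when" a block is dead and "which" block to fetch next) can be illustrated with a small software model. The sketch below is not the paper's hardware design: the class name, the method names, and the choice of a truncated sum of instruction addresses as the per-block trace signature are assumptions made for illustration, and the table is an unbounded dictionary rather than a fixed-size structure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DeadBlockCorrelatingPrefetcher:
    """Toy software model (hypothetical); real designs use bounded tables."""

    sig_bits: int = 16  # assumed signature width
    # Running trace signature for each block currently in the cache.
    live_sig: dict = field(default_factory=dict)
    # (block, signature at eviction) -> block that replaced it.
    table: dict = field(default_factory=dict)

    def access(self, block: int, pc: int) -> Optional[int]:
        """Record an access by the instruction at `pc`.

        Returns a block address to prefetch if this access matches a
        previously seen last touch, otherwise None.
        """
        mask = (1 << self.sig_bits) - 1
        sig = (self.live_sig.get(block, 0) + pc) & mask
        self.live_sig[block] = sig
        # A table hit means: this trace ended in an eviction before, so
        # predict the block dead and return its learned successor.
        return self.table.get((block, sig))

    def evict(self, victim: int, replacement: int) -> None:
        """Learn that the victim's final signature marked its last touch."""
        sig = self.live_sig.pop(victim, None)
        if sig is not None:
            self.table[(victim, sig)] = replacement


# Two blocks that repeatedly displace each other from the same cache frame.
A, B = 0x1000, 0x2000
p = DeadBlockCorrelatingPrefetcher()
p.access(A, 0x10)
p.access(A, 0x14)
p.evict(A, B)              # learn: trace (0x10, 0x14) on A ends with B replacing it
p.access(B, 0x20)
p.evict(B, A)
p.access(A, 0x10)          # trace not complete yet, no prediction
hint = p.access(A, 0x14)   # same trace as before: A predicted dead, prefetch B
```

In this model a prediction is issued at the last touch of a block rather than at the next miss, which is the source of the extra lookahead the abstract describes.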



Published In

ACM SIGARCH Computer Architecture News, Volume 29, Issue 2
Special Issue: Proceedings of the 28th annual international symposium on Computer architecture (ISCA '01)
May 2001
262 pages
ISSN:0163-5964
DOI:10.1145/384285
  • ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture
    June 2001
    289 pages
    ISBN:0769511627
    DOI:10.1145/379240

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Article

Article Metrics

  • Downloads (Last 12 months): 61
  • Downloads (Last 6 weeks): 16
Reflects downloads up to 10 Dec 2024

Cited By

  • (2023) Last-Level Cache Insertion and Promotion Policy in the Presence of Aggressive Prefetching. IEEE Computer Architecture Letters 22:1 (17-20). DOI: 10.1109/LCA.2023.3242178. Online publication date: 1-Jan-2023
  • (2022) Write-awareness prefetching for non-volatile cache in energy-constrained IoT device. IEICE Electronics Express 19:3 (20210499-20210499). DOI: 10.1587/elex.19.20210499. Online publication date: 10-Feb-2022
  • (2022) Dynamic Set Stealing to Improve Cache Performance. 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) (60-70). DOI: 10.1109/SBAC-PAD55451.2022.00017. Online publication date: Nov-2022
  • (2022) Applying machine learning to enhance the cache performance using reuse distance. Evolutionary Intelligence 16:4 (1195-1216). DOI: 10.1007/s12065-022-00730-1. Online publication date: 27-May-2022
  • (2021) LPE: Locality-Based Dead Prediction in Exclusive TLB for Large Coverage. Journal of Circuits, Systems and Computers 30:16. DOI: 10.1142/S0218126621502923. Online publication date: 28-Jun-2021
  • (2021) Dead Page and Dead Block Predictors: Cleaning TLBs and Caches Together. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (507-519). DOI: 10.1109/HPCA51647.2021.00050. Online publication date: Feb-2021
  • (2021) SRCP: sharing and reuse-aware replacement policy for the partitioned cache in multicore systems. Design Automation for Embedded Systems 25:3 (193-211). DOI: 10.1007/s10617-021-09251-z. Online publication date: 1-Sep-2021
  • (2020) RedCache. Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference (1-6). DOI: 10.5555/3437539.3437695. Online publication date: 20-Jul-2020
  • (2020) A Categorical Study on Cache Replacement Policies for Hierarchical Cache Memory. Applications of Internet of Things (201-211). DOI: 10.1007/978-981-15-6198-6_19. Online publication date: 4-Aug-2020
  • (2019) ECAP: energy-efficient caching for prefetch blocks in tiled chip multiprocessors. IET Computers & Digital Techniques. DOI: 10.1049/iet-cdt.2019.0035. Online publication date: 30-Apr-2019
