research-article

Perceptron learning for reuse prediction

Authors:

Daniel A. Jiménez

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture

Article No.: 2, Pages 1 - 12

Published: 15 October 2016

Abstract

The disparity between last-level cache and memory latencies motivates the search for efficient cache management policies. Recent work in predicting reuse of cache blocks enables optimizations that significantly improve cache performance and efficiency. However, the accuracy of the prediction mechanisms limits the scope of optimization. This paper proposes perceptron learning for reuse prediction. The proposed predictor greatly improves accuracy over previous work. For multiprogrammed workloads, the average false positive rate of the proposed predictor is 3.2%, while sampling dead block prediction (SDBP) and signature-based hit prediction (SHiP) yield false positive rates above 7%. The improvement in accuracy translates directly into performance. For single-thread workloads and a 4MB last-level cache, reuse prediction with perceptron learning enables a replacement and bypass optimization to achieve a geometric mean speedup of 6.1%, compared with 3.8% for SHiP and 3.5% for SDBP on the SPEC CPU 2006 benchmarks. On a memory-intensive subset of SPEC, perceptron learning yields 18.3% speedup, versus 10.5% for SHiP and 7.7% for SDBP. For multiprogrammed workloads and a 16MB cache, the proposed technique doubles the efficiency of the cache over LRU and yields a geometric mean normalized weighted speedup of 7.4%, compared with 4.4% for SHiP and 4.2% for SDBP.
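The abstract does not describe the predictor's internals, so the following is only a minimal sketch of generic perceptron-style prediction applied to block reuse, not the paper's design. Everything specific here is an assumption made for illustration: the class name `PerceptronReusePredictor`, the modulo hash, the table size, the thresholds `tau` and `theta`, the weight range, and the example feature values. Each feature selects one small signed weight; the weights are summed, and a sum above `tau` predicts that the block will not be reused. Training nudges the selected weights only after a misprediction or when the sum is still below the confidence threshold `theta`.

```python
class PerceptronReusePredictor:
    """Illustrative perceptron-style reuse predictor (all parameters assumed)."""

    def __init__(self, num_features, table_size=256, theta=8, tau=0, wmax=31):
        # One table of small signed weights per input feature.
        self.tables = [[0] * table_size for _ in range(num_features)]
        self.table_size = table_size
        self.theta = theta  # keep training while |sum| is below this
        self.tau = tau      # a sum above this predicts "no reuse"
        self.wmax = wmax    # weights saturate in [-wmax - 1, wmax]

    def _indices(self, features):
        # Hypothetical hash: fold each integer feature into its table.
        return [f % self.table_size for f in features]

    def output(self, features):
        idx = self._indices(features)
        return sum(table[i] for table, i in zip(self.tables, idx))

    def predict_no_reuse(self, features):
        return self.output(features) > self.tau

    def train(self, features, reused):
        y = self.output(features)
        # Predicting "no reuse" for a block that was reused (or the
        # opposite case) counts as a misprediction.
        mispredicted = (y > self.tau) == reused
        if mispredicted or abs(y) < self.theta:
            delta = -1 if reused else 1
            for table, i in zip(self.tables, self._indices(features)):
                table[i] = max(-self.wmax - 1, min(self.wmax, table[i] + delta))


# Usage with made-up feature values (for example, hashed instruction addresses).
predictor = PerceptronReusePredictor(num_features=2)
features = [0x4005D0, 0x12]
for _ in range(20):
    predictor.train(features, reused=False)
print(predictor.predict_no_reuse(features))
```

A prediction like this could drive the replacement and bypass optimization mentioned in the abstract: a block predicted not to be reused is a candidate for bypassing the cache or for early eviction. Training only below `theta` stops the weights from growing once the predictor is confident, which lets it change its mind faster when behavior shifts.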

Published In

MICRO-49, October 15 - 19, 2016, Taipei, Taiwan

816 pages

General Chairs: Wei-Chung Hsu (NTU, Taiwan), Chia-Lin Yang (NTU, Taiwan)

Program Chairs: Mikko Lipasti (Univ. Wisconsin), Hsien-Hsin Lee (TSMC, Taiwan)

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
IEEE-CS\DATC: IEEE Computer Society

Publisher

IEEE Press

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%
