DOI: 10.5555/3195638.3195641
research-article

Perceptron learning for reuse prediction

Published: 15 October 2016

Abstract

The disparity between last-level cache and memory latencies motivates the search for efficient cache management policies. Recent work in predicting reuse of cache blocks enables optimizations that significantly improve cache performance and efficiency. However, the accuracy of the prediction mechanisms limits the scope of optimization. This paper proposes perceptron learning for reuse prediction. The proposed predictor greatly improves accuracy over previous work. For multi-programmed workloads, the average false positive rate of the proposed predictor is 3.2%, while sampling dead block prediction (SDBP) and signature-based hit prediction (SHiP) yield false positive rates above 7%. The improvement in accuracy translates directly into performance. For single-thread workloads and a 4MB last-level cache, reuse prediction with perceptron learning enables a replacement and bypass optimization to achieve a geometric mean speedup of 6.1%, compared with 3.8% for SHiP and 3.5% for SDBP on the SPEC CPU 2006 benchmarks. On a memory-intensive subset of SPEC, perceptron learning yields 18.3% speedup, versus 10.5% for SHiP and 7.7% for SDBP. For multi-programmed workloads and a 16MB cache, the proposed technique doubles the efficiency of the cache over LRU and yields a geometric mean normalized weighted speedup of 7.4%, compared with 4.4% for SHiP and 4.2% for SDBP.
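
The abstract does not describe the predictor's internals, so the following is only a minimal sketch of how a perceptron-style reuse predictor might be organized, not the paper's design. It assumes a hashed-perceptron layout: each feature selects one small saturating weight from its own table, and the selected weights are summed and compared with a threshold. The class name, the choice of features, the table size, the weight range, and both thresholds are all hypothetical.

```python
class PerceptronReusePredictor:
    """Hypothetical hashed-perceptron sketch: one weight table per feature."""

    def __init__(self, num_features, table_size=256, weight_limit=31,
                 predict_threshold=0, train_threshold=30):
        # All sizes and thresholds here are illustrative assumptions.
        self.table_size = table_size
        self.weight_limit = weight_limit
        self.predict_threshold = predict_threshold
        self.train_threshold = train_threshold
        self.tables = [[0] * table_size for _ in range(num_features)]

    def _indices(self, features):
        # Features are plain integers (for example, bits of a PC or address).
        return [f % self.table_size for f in features]

    def confidence(self, features):
        # Sum the one weight that each feature selects in its own table.
        return sum(t[i] for t, i in zip(self.tables, self._indices(features)))

    def predict_no_reuse(self, features):
        # A sum above the threshold means "this block will not be reused".
        return self.confidence(features) > self.predict_threshold

    def train(self, features, was_reused):
        total = self.confidence(features)
        predicted_no_reuse = total > self.predict_threshold
        actual_no_reuse = not was_reused
        # Perceptron rule: update on a misprediction or on low confidence.
        if (predicted_no_reuse != actual_no_reuse
                or abs(total) < self.train_threshold):
            step = 1 if actual_no_reuse else -1
            for t, i in zip(self.tables, self._indices(features)):
                # Saturating update keeps each weight in a small fixed range.
                t[i] = max(-self.weight_limit,
                           min(self.weight_limit, t[i] + step))


# Usage: two hypothetical accesses that share their second feature.
predictor = PerceptronReusePredictor(num_features=2)
dead_access = (0x400123, 7)   # assumed to never be reused
live_access = (0x400456, 7)   # assumed to always be reused
for _ in range(40):
    predictor.train(dead_access, was_reused=False)
for _ in range(40):
    predictor.train(live_access, was_reused=True)
```

In a cache, a "no reuse" prediction could drive the bypass and replacement decisions that the abstract mentions. A false positive in this sketch is a block predicted as not reused that is in fact reused.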

Published In

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture
October 2016
816 pages

Publisher

IEEE Press

Qualifiers

  • Research-article

Conference

MICRO-49

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 10 Dec 2024

Cited By

  • (2024) An Efficient Deep Reinforcement Learning-Based Automatic Cache Replacement Policy in Cloud Block Storage Systems. IEEE Transactions on Computers 73:1 (164-177). DOI: 10.1109/TC.2023.3325625. Online publication date: 1-Jan-2024
  • (2022) A Survey of Machine Learning for Computer Architecture and Systems. ACM Computing Surveys 55:3 (1-39). DOI: 10.1145/3494523. Online publication date: 3-Feb-2022
  • (2021) Block Popularity Prediction for Multimedia Storage Systems Using Spatial-Temporal-Sequential Neural Networks. Proceedings of the 29th ACM International Conference on Multimedia (3390-3398). DOI: 10.1145/3474085.3475495. Online publication date: 17-Oct-2021
  • (2021) Ripple. Proceedings of the 48th Annual International Symposium on Computer Architecture (734-747). DOI: 10.1109/ISCA52012.2021.00063. Online publication date: 14-Jun-2021
  • (2020) OSCA. Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference (785-798). DOI: 10.5555/3489146.3489200. Online publication date: 15-Jul-2020
  • (2019) The impact of cache inclusion policies on cache management techniques. Proceedings of the International Symposium on Memory Systems (428-438). DOI: 10.1145/3357526.3357547. Online publication date: 30-Sep-2019
  • (2019) Applying Deep Learning to the Cache Replacement Problem. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (413-425). DOI: 10.1145/3352460.3358319. Online publication date: 12-Oct-2019
  • (2019) Perceptron-based prefetch filtering. Proceedings of the 46th International Symposium on Computer Architecture (1-13). DOI: 10.1145/3307650.3322207. Online publication date: 22-Jun-2019
  • (2018) Rethinking the memory hierarchy for modern languages. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (203-216). DOI: 10.1109/MICRO.2018.00025. Online publication date: 20-Oct-2018
