DOI: 10.1007/11549468_58
Article

Dynamic partition of memory reference instructions – a register guided approach

Published: 30 August 2005

Abstract

A high-bandwidth L1 data cache is essential for achieving high performance in wide-issue processors. Previous studies have shown that replacing a large monolithic multi-ported L1 data cache with multiple small single-ported caches can be a scalable and inexpensive way to provide higher bandwidth. Many schemes have been proposed for directing memory references to these caches so that performance closely matches that of an ideal multi-ported cache. However, most previous designs seldom take dynamic data access patterns into consideration, and thus suffer from access conflicts within a single cache and unbalanced loads across the caches. We observe that if the data references in a program can be grouped into several regions (access regions) that can be accessed in parallel, then providing a separate small cache (an access region cache) for each region may perform better than previous multi-cache schemes. The register-guided memory reference partition approach proposed in this paper effectively identifies these semantic regions and organizes them across multiple caches adaptively, maximizing concurrent accesses without incurring much overhead. In our design, the base register number of a memory reference instruction, not the register's content, serves as the basic guide for instruction steering. A reassignment mechanism captures the pattern as the program moves across its access regions. In addition, a distribution mechanism further reduces residual conflicts by adaptively allowing access regions to extend or shrink among the physical caches. Our simulations of the SPEC CPU2000 benchmarks show that the register-guided approach reduces conflicts effectively, distributes memory reference instructions properly, and yields a considerable improvement in IPC.
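
The steering idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the abstract does not specify the table layout, the conflict detection, or the reassignment policy, so the class name `RegisterGuidedSteering`, the modulo initial mapping, the per-register conflict counter, the `conflict_threshold` parameter, and the "move to the least-loaded cache" rule are all assumptions made for illustration. The distribution mechanism is not modeled.

```python
from collections import Counter


class RegisterGuidedSteering:
    """Toy model: steer memory references to caches by base register number.

    Everything here is a hypothetical illustration, not the paper's mechanism.
    """

    def __init__(self, num_registers=32, num_caches=4, conflict_threshold=4):
        self.num_caches = num_caches
        self.conflict_threshold = conflict_threshold
        # Assumed initial mapping: register r starts on cache r mod num_caches.
        self.table = [r % num_caches for r in range(num_registers)]
        # Assumed bookkeeping: one conflict counter per base register.
        self.conflicts = [0] * num_registers

    def steer(self, base_reg):
        """Pick a cache from the register *number*, never its contents."""
        return self.table[base_reg]

    def issue_cycle(self, base_regs):
        """Steer the references issued in one cycle.

        Returns {cache index: [base registers sent to it this cycle]}.
        """
        groups = {}
        for reg in base_regs:
            groups.setdefault(self.steer(reg), []).append(reg)
        load = Counter({cache: len(regs) for cache, regs in groups.items()})
        for regs in groups.values():
            # Each cache is single-ported, so every reference after the
            # first one in the same cycle counts as a conflict.
            for reg in regs[1:]:
                self.conflicts[reg] += 1
                if self.conflicts[reg] >= self.conflict_threshold:
                    self._reassign(reg, load)
        return groups

    def _reassign(self, reg, load):
        """Assumed policy: move the register to the least-loaded cache."""
        target = min(range(self.num_caches), key=lambda cache: load[cache])
        load[self.table[reg]] -= 1
        load[target] += 1
        self.table[reg] = target
        self.conflicts[reg] = 0
```

In this toy model, two base registers that start on the same cache and repeatedly issue in the same cycle end up on different caches once the conflict counter reaches the threshold. That is the kind of adaptation the reassignment mechanism is described as providing, though the real trigger and policy may differ.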

Cited By

  • (2009) Access region cache with register guided memory reference partitioning. Journal of Systems Architecture: the EUROMICRO Journal 55:10-12 (434-445). DOI: 10.1016/j.sysarc.2009.09.002. Online publication date: 1-Oct-2009

Published In

Euro-Par'05: Proceedings of the 11th International Euro-Par Conference on Parallel Processing
August 2005
1294 pages
ISBN: 3540287000
Editors: José C. Cunha, Pedro D. Medeiros

Publisher

Springer-Verlag, Berlin, Heidelberg
