[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article
Free access

Efficient simulation of caches under optimal replacement with applications to miss characterization

Published: 01 June 1993 Publication History

Abstract

Cache miss characterization models such as the three Cs model are useful in developing schemes to reduce cache misses and their penalty. In this paper we propose the OPT model that uses cache simulation under optimal (OPT) replacement to obtain a finer and more accurate characterization of misses than the three Cs model. However, current methods for optimal cache simulation are slow and difficult to use. We present three new techniques for optimal cache simulation. First, we propose a limited lookahead strategy with error fixing, which allows one pass simulation of multiple optimal caches. Second, we propose a scheme to group entries in the OPT stack, which allows efficient tree based fully-associative cache simulation under OPT. Third, we propose a scheme for exploiting partial inclusion in set-associative cache simulation under OPT. Simulators based on these algorithms were used to obtain cache miss characterizations using the OPT model for nine SPEC benchmarks. The results indicate that miss ratios under OPT are substantially lower than those under LRU replacement, by up to 70% in fully-associative caches, and up to 32% in two-way set-associative caches.

References

[1]
A. Agarwal. Analysis of Cache Performance for Operating Systems and Mu{tiprogramming. PhD thesis, Stanford University, 1988. Available as Technical Report CSL-TR-87-332.
[2]
L. A. Belady. A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal, 5(2):78-101, 1966.
[3]
B. T. Bennett and V. J. Kruskal. LRU stack processing. IBM J. of Research and Development, pages 353-357, July 1975.
[4]
D. Gannon, W. Jalby, and K. GalUvan. Strategies for cache and local memory management by global program transformation. J. of Par. and Dist. Comp., 5:587-616, 1988.
[5]
J. L. Hennessy emd D. A. Patterson. Computer Architecture -- A Quantitive Approach. Morgan Kaufmann Publishers Inc., 1990.
[6]
M. D. Hill. Aspects of Cache Memory and Instruction fornia, Berkeley, 1987. Available as Technical Report UCB/CSD 87/381.
[7]
M. D. Hill and A. J. Smith. Evaluating associativity in CPU caches. IEEE Trans. on Computers, 38(12):1612- 1630, December 1989.
[8]
W. W. Hwu and P. P. Chang. Achieving high instruction cache performance with an optimizing compiler. In Proc. of 16th Intl. Syrup. on Computer Architecture, pages 242-251, 1989.
[9]
W. W. Hwu and T. M. Conte. The susceptibility of programs to context switching. Technical Report CRHC- 91-14, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, April 1991.
[10]
N. P. Jouppi. Improving direct-mapped cache performmice by the addition of & small fully-associative cache and prefetch buffets. In Proc. of 17th Intl. Syrup. on Computer Architecture, pages 364-373, 1990.
[11]
Y. H. Kim, M. D. Hill, and D. A. Wood. Implementing stack simulation for highly-associative memories. In Proc. ACM SIGMETRICS Cont., pages 212-213, 1991.
[12]
R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2):78-117, 1970.
[13]
S. McFarling. Program Analysis and Optimization for Machines with Instruction Cache. PhD thesis, Stanford University, 1988. Technical Report No. CSL-TR-91- 493.
[14]
S. McFarling. Program optimization for instruction caches. In Proceedings of ASPLOS III, 1989.
[15]
T. N. Mudge et al. The design of a microsupercomputer. IEEE Computer, pages 57-64, Jan 1991.
[16]
F. Olken. Efficient methods for calculating the success function of fixed space replacement policies. Technical Report LBL-12370, Lawrence Berkeley Laboratory, 1981.
[17]
D. D. Sleator and R. E. Taxjan. Serf adjusting binary search trees. J. of the A CM, 32(3):652-686, 1985.
[18]
H. S. Stone. High.Performance Computer Architecture. Addison-Wesley, 2"d edition, 1987.
[19]
R. A. Sugumar and S. G. Abraham. Efficient simulation of multiple cache configurations using binomial trees. Technical Report CSE-TR-111-91, CSE Division, University of Michigan, 1991.
[20]
R. A. Suguma~ and S. G. Abraham. Efficient simulation of caches under optimal replacement with applications to miss characterization. Technical Report CSE-TR- 143-92, CSE Division, University of Michigan, 1992.
[21]
D. Thiebaut. On the frsctal dimension of computer programs and its application to the prediction of the cache miss ratio. IEEE Trans. on Computers, 38(7):1012- 1026, July 1989.
[22]
J. G. Thompson. Efficient analysis of Caching Systems. PhD thesis, University of California, Berkeley, 1987.

Cited By

View all
  • (2019)A fast analytical model of fully associative cachesProceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3314221.3314606(816-829)Online publication date: 8-Jun-2019
  • (2019)A Survey of Computer Architecture Simulation Techniques and ToolsIEEE Access10.1109/ACCESS.2019.29176987(78120-78145)Online publication date: 2019
  • (2018)The CSI Framework for Compiler-Inserted Program InstrumentationACM SIGMETRICS Performance Evaluation Review10.1145/3292040.321965746:1(100-102)Online publication date: 12-Jun-2018
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGMETRICS Performance Evaluation Review
ACM SIGMETRICS Performance Evaluation Review  Volume 21, Issue 1
June 1993
280 pages
ISSN:0163-5999
DOI:10.1145/166962
Issue’s Table of Contents
  • cover image ACM Conferences
    SIGMETRICS '93: Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
    June 1993
    286 pages
    ISBN:0897915801
    DOI:10.1145/166955
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 1993
Published in SIGMETRICS Volume 21, Issue 1

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)99
  • Downloads (Last 6 weeks)8
Reflects downloads up to 17 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2019)A fast analytical model of fully associative cachesProceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3314221.3314606(816-829)Online publication date: 8-Jun-2019
  • (2019)A Survey of Computer Architecture Simulation Techniques and ToolsIEEE Access10.1109/ACCESS.2019.29176987(78120-78145)Online publication date: 2019
  • (2018)The CSI Framework for Compiler-Inserted Program InstrumentationACM SIGMETRICS Performance Evaluation Review10.1145/3292040.321965746:1(100-102)Online publication date: 12-Jun-2018
  • (2017)The CSI Framework for Compiler-Inserted Program InstrumentationProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/31545021:2(1-25)Online publication date: 19-Dec-2017
  • (2016)Concurrent memory subsystem and application optimization for ASIP design2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS)10.1109/SAMOS.2016.7818325(1-10)Online publication date: Jul-2016
  • (2014)Undersubscribed threading on clustered cache architectures2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)10.1109/HPCA.2014.6835975(678-689)Online publication date: Feb-2014
  • (2011)Fast modeling of shared caches in multicore systemsProceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers10.1145/1944862.1944885(147-157)Online publication date: 24-Jan-2011
  • (2005)Optimizing instruction cache performance of embedded systemsACM Transactions on Embedded Computing Systems10.1145/1113830.11138394:4(934-965)Online publication date: 1-Nov-2005
  • (2004)Timestamp-Based Selective Cache AllocationHigh Performance Memory Systems10.1007/978-1-4419-8987-1_4(43-59)Online publication date: 2004
  • (2002)A cache-aware program transformation technique suitable for embedded systemsInformation and Software Technology10.1016/S0950-5849(02)00107-644:13(783-795)Online publication date: Oct-2002
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media