Using Pin as a memory reference generator for multiprocessor simulation

Published: 01 December 2005

Abstract

In this paper we describe how we have used Pin to generate a multithreaded reference stream for simulation of a multiprocessor on a uniprocessor. We have taken special care to model as accurately as possible the effects of cache coherence protocol state, and of lock and barrier synchronization, on the performance of multithreaded applications running on multiprocessor hardware. We first describe a simplified version of the algorithm, which uses semaphores to synchronize instrumented application threads and the simulator. We then describe modifications to that algorithm to model the microarchitectural features of the Itanium2 that affect the timing of memory reference issue. An experimental evaluation determines that, while our methods enable accurate simulation, the use of semaphores has a negative impact on the performance of the simulator.
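
The semaphore handoff mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm and does not use Pin: it only assumes that each application thread posts one memory reference at a time and then stalls until a simulator loop has consumed it. The names `simulate`, `traces`, `slots`, `posted`, and `resume` are hypothetical, the round-robin service order is an assumption, and cache coherence state, locks, barriers, and Itanium2 issue timing are not modeled.

```python
import threading

def simulate(traces):
    """Merge per-thread address lists into one stream, one reference per thread per round."""
    n = len(traces)
    slots = [None] * n                                   # latest reference posted by each thread
    posted = [threading.Semaphore(0) for _ in range(n)]  # thread -> simulator: slot is filled
    resume = [threading.Semaphore(0) for _ in range(n)]  # simulator -> thread: continue
    stream = []

    def app_thread(tid):
        for addr in traces[tid]:
            slots[tid] = addr       # stands in for an instrumented load or store
            posted[tid].release()
            resume[tid].acquire()   # stall until the simulator has consumed the reference
        slots[tid] = None           # sentinel: this thread has finished
        posted[tid].release()

    workers = [threading.Thread(target=app_thread, args=(t,)) for t in range(n)]
    for w in workers:
        w.start()

    live = set(range(n))
    while live:
        for tid in sorted(live):    # assumed policy: fixed round-robin order by thread id
            posted[tid].acquire()
            if slots[tid] is None:
                live.discard(tid)
            else:
                stream.append((tid, slots[tid]))
                resume[tid].release()

    for w in workers:
        w.join()
    return stream
```

Because every reference costs two semaphore operations and the threads run in lockstep with the simulator loop, the resulting order is deterministic, and the sketch also hints at why this style of synchronization can be slow.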

Published In

ACM SIGARCH Computer Architecture News, Volume 33, Issue 5
Special issue on the 2005 workshop on binary instrumentation and application
December 2005
93 pages
ISSN: 0163-5964
DOI: 10.1145/1127577

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Cited By

  • (2022) PPT-Multicore: performance prediction of OpenMP applications using reuse profiles and analytical modeling. The Journal of Supercomputing 78:2 (2354-2385). DOI: 10.1007/s11227-021-03949-4. Online publication date: 1-Feb-2022
  • (2019) Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP. PLOS ONE 14:8 (e0220135). DOI: 10.1371/journal.pone.0220135. Online publication date: 1-Aug-2019
  • (2017) A flexible multi-core functional cache simulator (FM-SIM). Proceedings of the Summer Simulation Multi-Conference (1-12). DOI: 10.5555/3140065.3140094. Online publication date: 9-Jul-2017
  • (2017) Using Multicore Reuse Distance to Study Coherence Directories. ACM Transactions on Computer Systems 35:2 (1-49). DOI: 10.1145/3092702. Online publication date: 28-Jul-2017
  • (2017) Optimizing locality in graph computations using reuse distance profiles. 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC) (1-8). DOI: 10.1109/PCCC.2017.8280444. Online publication date: Dec-2017
  • (2015) Studying the impact of multicore processor scaling on directory techniques via reuse distance analysis. 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA) (590-602). DOI: 10.1109/HPCA.2015.7056065. Online publication date: Feb-2015
  • (2013) Studying multicore processor scaling via reuse distance analysis. ACM SIGARCH Computer Architecture News 41:3 (499-510). DOI: 10.1145/2508148.2485965. Online publication date: 23-Jun-2013
  • (2013) Studying multicore processor scaling via reuse distance analysis. Proceedings of the 40th Annual International Symposium on Computer Architecture (499-510). DOI: 10.1145/2485922.2485965. Online publication date: 23-Jun-2013
  • (2013) Efficient Reuse Distance Analysis of Multicore Scaling for Loop-Based Parallel Programs. ACM Transactions on Computer Systems (TOCS) 31:1 (1-37). DOI: 10.1145/2427631.2427632. Online publication date: 1-Feb-2013
  • (2012) Identifying optimal multicore cache hierarchies for loop-based parallel programs via reuse distance analysis. Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness (2-11). DOI: 10.1145/2247684.2247687. Online publication date: 16-Jun-2012
