
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications

Published: 13 March 2010

Abstract

Instruction-grain lifeguards monitor the events of a running application at the level of individual instructions in order to identify and help mitigate application bugs and security exploits. Because such lifeguards impose a 10-100X slowdown on existing platforms, previous studies have proposed hardware designs to accelerate lifeguard processing. However, these accelerators are either tailored to a specific class of lifeguards or suitable only for monitoring single-threaded programs.
We present ParaLog, the first design of a system enabling fast online parallel monitoring of multithreaded applications. ParaLog supports a broad class of software-defined lifeguards. We show how three existing accelerators can be enhanced to support online multithreaded monitoring, dramatically reducing lifeguard overheads. We identify and solve several challenges in monitoring parallel applications and/or parallelizing these accelerators, including (i) enforcing inter-thread data dependences, (ii) dealing with inter-thread effects that are not reflected in coherence traffic, (iii) dealing with unmonitored operating system activity, and (iv) ensuring lifeguards can access shared metadata with negligible synchronization overheads. We present our system design for both Sequentially Consistent and Total Store Ordering processors. We implement and evaluate our design on a 16-core simulated CMP, using benchmarks from SPLASH-2 and PARSEC and two lifeguards: a data-flow tracking lifeguard and a memory-access checker lifeguard. Our results show that (i) our parallel accelerators improve performance by 2-9X and 1.13-3.4X for our two lifeguards, respectively, (ii) we are 5-126X faster than the time-slicing approach required by existing techniques, and (iii) our average overheads for applications with eight threads are 51% and 28% for the two lifeguards, respectively.
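To make challenge (i) concrete, the following is a minimal software sketch of why a data-flow tracking lifeguard must respect inter-thread data dependences. It is a toy model, not ParaLog's hardware design: the names `Event` and `run_lifeguards`, the three event kinds, and the `after` dependence arcs are all assumptions introduced here for illustration. Each application thread has its own event log and its own lifeguard cursor, and a lifeguard stalls on an event until every event named in that event's arcs has been processed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One logged application event (hypothetical format, for illustration)."""
    op: str             # "taint", "copy", or "check"
    dst: str = ""       # destination location ("taint", "copy")
    src: str = ""       # source location ("copy") or checked location ("check")
    after: tuple = ()   # (thread_id, index) pairs that must be processed first


def run_lifeguards(logs):
    """Process per-thread logs with one lifeguard cursor per thread.

    Returns a list of (thread_id, location) pairs flagged by "check" events.
    """
    tainted = set()                      # shared metadata: tainted locations
    cursor = {tid: 0 for tid in logs}    # next unprocessed index per thread
    alerts = []
    progressed = True
    while progressed:
        progressed = False
        for tid, log in logs.items():
            i = cursor[tid]
            if i >= len(log):
                continue
            ev = log[i]
            # Stall until every event this one depends on has been processed.
            if any(cursor[t] <= k for t, k in ev.after):
                continue
            if ev.op == "taint":
                tainted.add(ev.dst)
            elif ev.op == "copy":
                # Taint status follows the data from src to dst.
                if ev.src in tainted:
                    tainted.add(ev.dst)
                else:
                    tainted.discard(ev.dst)
            elif ev.op == "check" and ev.src in tainted:
                alerts.append((tid, ev.src))
            cursor[tid] = i + 1
            progressed = True
    if any(cursor[t] < len(logs[t]) for t in logs):
        raise RuntimeError("dependence cycle: lifeguards cannot make progress")
    return alerts


# Thread 0 taints "buf" and copies it to "x"; thread 1 then uses "x".
producer = [Event("taint", dst="buf"), Event("copy", dst="x", src="buf")]
ordered = {1: [Event("check", src="x", after=((0, 1),))], 0: producer}
unordered = {1: [Event("check", src="x")], 0: producer}
```

With the dependence arc in place, `run_lifeguards(ordered)` flags thread 1's use of `x`. Without it, `run_lifeguards(unordered)` lets thread 1's lifeguard run ahead of thread 0's and the tainted use goes unreported, which is the kind of error that enforcing inter-thread dependences is meant to prevent.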




Published In

ACM SIGARCH Computer Architecture News  Volume 38, Issue 1
ASPLOS '10
March 2010
399 pages
ISSN:0163-5964
DOI:10.1145/1735970

ASPLOS XV: Proceedings of the Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems
March 2010
422 pages
ISBN: 9781605588391
DOI: 10.1145/1736020
General Chair: James C. Hoe
Program Chair: Vikram S. Adve

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hardware support for debugging
  2. instruction-grain lifeguards
  3. online parallel monitoring

Qualifiers

  • Research-article
