Slipstream processors: improving both performance and fault tolerance

Authors: Karthik Sundaramoorthy, Eric Rotenberg

ACM SIGPLAN Notices, Volume 35, Issue 11

Pages 257 - 268

https://doi.org/10.1145/356989.357013

Published: 01 November 2000

Abstract

Processors execute the full dynamic instruction stream to arrive at the final output of a program, yet there exist shorter instruction streams that produce the same overall effect. We propose creating a shorter but otherwise equivalent version of the original program by removing ineffectual computation and computation related to highly-predictable control flow. The shortened program is run concurrently with the full program on a chip multiprocessor or simultaneous multithreaded processor, with two key advantages:

1) Improved single-program performance. The shorter program speculatively runs ahead of the full program and supplies the full program with control and data flow outcomes. The full program executes efficiently due to the communicated outcomes, at the same time validating the speculative, shorter program. The two programs combined run faster than the original program alone. Detailed simulations of an example implementation show an average improvement of 7% for the SPEC95 integer benchmarks.

2) Fault tolerance. The shorter program is a subset of the full program and this partial-redundancy is transparently leveraged for detecting and recovering from transient hardware faults.
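The idea in the abstract can be illustrated with a toy software model. The sketch below is not the paper's hardware design: the instruction format, the function names (`find_ineffectual`, `run_short`, `run_full_and_check`), and the choice of "writes that leave a register unchanged" as the only kind of ineffectual computation are all assumptions made for illustration. It also runs the two streams one after the other, whereas the abstract describes them running concurrently.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy instruction: (dest, src_a, src_b, const)
# meaning dest = src_a + src_b + const, where a source of None reads as 0.
Instr = Tuple[str, Optional[str], Optional[str], int]


def execute(instr: Instr, regs: Dict[str, int]) -> int:
    """Compute the value an instruction would write, without writing it."""
    _, a, b, const = instr
    return (regs.get(a, 0) if a else 0) + (regs.get(b, 0) if b else 0) + const


def find_ineffectual(program: List[Instr]) -> Set[int]:
    """Profile run: indices of writes that leave the destination unchanged."""
    regs: Dict[str, int] = {}
    skip: Set[int] = set()
    for i, instr in enumerate(program):
        value = execute(instr, regs)
        if instr[0] in regs and regs[instr[0]] == value:
            skip.add(i)
        regs[instr[0]] = value
    return skip


def run_short(program: List[Instr], skip: Set[int]) -> Tuple[Dict[str, int], Dict[int, int]]:
    """Shortened stream: skips the marked instructions and records its outcomes."""
    regs: Dict[str, int] = {}
    outcomes: Dict[int, int] = {}
    for i, instr in enumerate(program):
        if i in skip:
            continue
        value = execute(instr, regs)
        regs[instr[0]] = value
        outcomes[i] = value
    return regs, outcomes


def run_full_and_check(program: List[Instr], outcomes: Dict[int, int]) -> Tuple[Dict[str, int], List[int]]:
    """Full stream: executes everything and validates the communicated outcomes."""
    regs: Dict[str, int] = {}
    mismatches: List[int] = []
    for i, instr in enumerate(program):
        value = execute(instr, regs)
        if i in outcomes and outcomes[i] != value:
            mismatches.append(i)  # disagreement: misspeculation or a fault
        regs[instr[0]] = value
    return regs, mismatches


program: List[Instr] = [
    ("r1", None, None, 5),  # r1 = 5
    ("r2", "r1", None, 1),  # r2 = r1 + 1
    ("r1", None, None, 5),  # r1 = 5 again: changes nothing
    ("r3", "r1", "r2", 0),  # r3 = r1 + r2
]

skip = find_ineffectual(program)
short_regs, outcomes = run_short(program, skip)
full_regs, mismatches = run_full_and_check(program, outcomes)
print(skip, short_regs == full_regs, mismatches)

# Simulate a transient fault by corrupting one communicated outcome.
faulty = dict(outcomes)
faulty[1] += 1
print(run_full_and_check(program, faulty)[1])
```

In this model the shortened stream reaches the same final register state with fewer instructions, and the full stream flags any communicated outcome that disagrees with its own result. That disagreement check is the toy counterpart of the validation and fault detection the abstract describes; recovery is not modeled here.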

Special Issue: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems (ASPLOS '00)


Published In

ACM SIGPLAN Notices Volume 35, Issue 11

Nov. 2000

269 pages

ISSN: 0362-1340

EISSN: 1558-1160

DOI: 10.1145/356989

Editors: Cindy Norris (Appalachian State Univ., Boone, NC), James B. Fenwick (Appalachian State Univ., Boone, NC)

Copyright © 2000 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States
