Slipstream processors: improving both performance and fault tolerance

Published: 01 November 2000

Abstract

Processors execute the full dynamic instruction stream to arrive at the final output of a program, yet there exist shorter instruction streams that produce the same overall effect. We propose creating a shorter but otherwise equivalent version of the original program by removing ineffectual computation and computation related to highly predictable control flow. The shortened program is run concurrently with the full program on a chip multiprocessor or simultaneous multithreaded processor, with two key advantages:

1) Improved single-program performance. The shorter program speculatively runs ahead of the full program and supplies the full program with control and data flow outcomes. The full program executes efficiently due to the communicated outcomes, at the same time validating the speculative, shorter program. The two programs combined run faster than the original program alone. Detailed simulations of an example implementation show an average improvement of 7% for the SPEC95 integer benchmarks.

2) Fault tolerance. The shorter program is a subset of the full program, and this partial redundancy is transparently leveraged for detecting and recovering from transient hardware faults.
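
The interplay between the shortened and the full program can be illustrated with a toy model. The sketch below is not the paper's implementation: the `Instr` record, the `removed` flag, the four-instruction program, and the functions `run_short`, `run_full`, and `simulate` are all hypothetical names chosen for illustration. It runs the two streams one after the other rather than concurrently, and the full stream only checks the communicated outcomes instead of using them as predictions. It shows the validation idea alone: the full program recomputes every value and flags any outcome from the shorter program that disagrees.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]


@dataclass(frozen=True)
class Instr:
    dest: str                    # register written by this instruction
    op: Callable[[State], int]   # computes the value from the current state
    removed: bool = False        # hypothetical flag: dropped from the short program


def run_short(program: List[Instr], state: State) -> Dict[int, int]:
    """Run only the instructions kept in the short program and
    record each outcome by instruction index."""
    outcomes: Dict[int, int] = {}
    for i, ins in enumerate(program):
        if ins.removed:
            continue
        state[ins.dest] = ins.op(state)
        outcomes[i] = state[ins.dest]
    return outcomes


def run_full(program: List[Instr], state: State,
             outcomes: Dict[int, int]) -> List[int]:
    """Run every instruction and return the indices whose value
    disagrees with the outcome supplied by the short program."""
    mismatches: List[int] = []
    for i, ins in enumerate(program):
        state[ins.dest] = ins.op(state)
        if i in outcomes and outcomes[i] != state[ins.dest]:
            mismatches.append(i)
    return mismatches


def make_program(removed_index: int) -> List[Instr]:
    body = [
        ("x", lambda s: 5),
        ("t", lambda s: 99),           # dead write: overwritten before any read
        ("t", lambda s: s["x"] + 1),
        ("y", lambda s: s["t"] * 2),
    ]
    return [Instr(d, f, removed=(i == removed_index))
            for i, (d, f) in enumerate(body)]


def simulate(removed_index: int) -> Tuple[List[int], State, State]:
    program = make_program(removed_index)
    short_state: State = {"x": 0, "t": 0, "y": 0}
    full_state: State = {"x": 0, "t": 0, "y": 0}
    outcomes = run_short(program, short_state)
    mismatches = run_full(program, full_state, outcomes)
    return mismatches, short_state, full_state


good = simulate(removed_index=1)   # removes the dead write: safe
bad = simulate(removed_index=2)    # removes an effectual write: unsafe
print(good[0], bad[0])             # prints "[] [3]"
```

Removing the dead write at index 1 leaves both streams with the same final state, so no mismatch is reported. Removing the effectual write at index 2 makes the shorter program compute a wrong `y`, which the full program detects at index 3. In this toy model the same comparison would also flag a corrupted value in either stream, which is the intuition behind using the partial redundancy for fault detection.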

Published In

ACM SIGPLAN Notices, Volume 35, Issue 11
Nov. 2000
269 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/356989

Publisher

Association for Computing Machinery

New York, NY, United States
