More Web Proxy on the site http://driver.im/

article

Free access

Limits of control flow on parallelism

Authors:

Robert P. WilsonAuthors Info & Claims

ACM SIGARCH Computer Architecture News, Volume 20, Issue 2

Pages 46 - 57

https://doi.org/10.1145/146628.139702

Published: 01 April 1992 Publication History

Abstract

This paper discusses three techniques useful in relaxing the constraints imposed by control flow on parallelism: control dependence analysis, executing multiple flows of control simultaneously, and speculative execution. We evaluate these techniques by using trace simulations to find the limits of parallelism for machines that employ different combinations of these techniques. We have three major results. First, local regions of code have limited parallelism, and control dependence analysis is useful in extracting global parallelism from different parts of a program. Second, a superscalar processor is fundamentally limited because it cannot execute independent regions of code concurrently. Higher performance can be obtained with machines, such as multiprocessors and dataflow machines, that can simultaneously follow multiple flows of control. Finally, without speculative execution to allow instructions to execute before their control dependences are resolved, only modest amounts of parallelism can be obtained for programs with complex control flow.

References

[1]

D. Bernstein and M. Rodeh. Global Instruction Scheduling for Superscalar Machines. In Proceedings of the ACM SIGPLAN'91 Conference on Programming Language Design and Implementation, pages 241-255, June 1991.

Digital Library

[2]

R. P. Colwell, R. P. Nix, j. J. O'Donnell, D. B. Papworth, and P. K. Rodman. A VLIW Architecture for a Trace Scheduling Compiler. IEEE Transactions on Computers, C-37(8):967-979, August 1988. Also in ASPLOS-3 pp. 180-192.

Digital Library

[3]

R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. An Efficient Method of Computing Static Single Assignment Form. In Proceedings of the 16th Annual ACM Symposium on Principles of Programming Languages, pages 25-35, January 1989.

Digital Library

[4]

J. A. Fisher. Trace Scheduling: A Technique for Global Microcode Compaction. IEEE Transactions on Computers, C-30(7):478-490, July 1981.

Digital Library

[5]

P. Y. Hsu. Highly Concurrent Scalar Processing. PhD thesis, University of Illinois at Urbana-Champaign, 1986.

Digital Library

[6]

W. W. Hwu, T. M. Conte, and P. P. Chang. Comparing Software and Hardware Schemes For Reducing the Cost of Branches. In Proceedings of the 16th Annual International Symposium on Computer Architecture, pages 224-233, May 1989.

Digital Library

[7]

M. Johnson. Superscalar Microprocessor Design. Prentice Hall, Englewood Cliffs, NJ, 1990.

[8]

S. McFarling and J. Hennessy. Reducing the Cost of Branches. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 396--404, June 1986.

Digital Library

[9]

K. Murakami, N. lrie, M. Kuga, and S. Tomitao SIMP (Single Instruction Stream/Multiple Instruction Pipelining): A Novel High-Speed Single-Processor Architecture. in Proceedings of the 16th Annual International Symposium on Computer Architecture, pages 78-85, May 1989.

Digital Library

[10]

A. Nicolau and J. A. Fisher. Measuring the Parallelism Available for Very Long Instruction Word Architectures. IEEE Transactions on Computers, C- 33(11):968-976, November 1984.

[11]

Y. N. Patt, S. W. Melvin, W. W. Hwu, and M. Shebanow. Critical Issues Regarding HPS, A High Performance Microarchitecture, In Proceedings of the 18th Annual Workshop on Microprogramming, pages 109- 116, December 1985.

Digital Library

[12]

E. M. Riseman and C. C. Foster. The Inhibition of Potential Parallelism by Conditional Jumps. IEEE Transactions on Computers, C-21(12):1405-1411, December 1972.

[13]

J. E. Smith and A. R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36-44, June 1985.

Digital Library

[14]

M. D. Smith, M. johnson, and M. A. Horowitz. Limits on Multiple Instruction Issue. In Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems, pages 290-302, April 1989.

Digital Library

[15]

M. D. Smith, M. S. Lam, and M. A. Horowitz. Boosting Beyond Static Scheduling in a Superscalar Processor. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 344-354, May 1990.

Digital Library

[16]

G. S. Sohi and S. Vajapeyam. Instruction Issue Logic for High-Performance, Interruptible Pipelined Processors. In Proceedings of the 14th Annual international Symposium on Computer Architecture, pages 27-34, June 1987.

Digital Library

[17]

D. W. Wall. Limits of Instruction-Level Parallelism. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 176-188, April 1991.

Digital Library

Cited By

Edwards JVishkin U(2021)Study of Fine-grained Nested Parallelism in CDCL SAT SolversACM Transactions on Parallel Computing10.1145/34706398:3(1-18)Online publication date: 20-Sep-2021
https://dl.acm.org/doi/10.1145/3470639
Shao YBrooks D(2015)Research Infrastructures for Hardware AcceleratorsSynthesis Lectures on Computer Architecture10.2200/S00677ED1V01Y201511CAC03410:4(1-99)Online publication date: 18-Nov-2015
https://doi.org/10.2200/S00677ED1V01Y201511CAC034
Goossens BParello DPorada KRahmoune D(2015)Toward a Core Design to Distribute an Execution on a Manycore ProcessorProceedings of the 13th International Conference on Parallel Computing Technologies - Volume 925110.1007/978-3-319-21909-7_38(390-404)Online publication date: 31-Aug-2015
https://dl.acm.org/doi/10.1007/978-3-319-21909-7_38
Show More Cited By

Index Terms

Limits of control flow on parallelism
1. Computer systems organization
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features
        Control structures

Recommendations

Limits of control flow on parallelism
ISCA '92: Proceedings of the 19th annual international symposium on Computer architecture

This paper discusses three techniques useful in relaxing the constraints imposed by control flow on parallelism: control dependence analysis, executing multiple flows of control simultaneously, and speculative execution. We evaluate these techniques by ...
Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading

To achieve high performance, contemporary computer systems rely on two forms of parallelism: instruction-level parallelism (ILP) and thread-level parallelism (TLP). Wide-issue super-scalar processors exploit ILP by executing multiple instructions from a ...
Exploiting Instruction-Level Parallelism for Integrated Control-Flow Monitoring

Computer architectures are using increased degrees of instruction-level machine parallelism to achieve higher performance, e.g., superpipelined, superscalar and very long instruction word (VLIW) processors. Full utilization of such machine parallelism ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGARCH Computer Architecture News

ACM SIGARCH Computer Architecture News Volume 20, Issue 2

Special Issue: Proceedings of the 19th annual international symposium on Computer architecture (ISCA '92)

May 1992

429 pages

ISSN:0163-5964

DOI:10.1145/146628

Editor:
Allan Gotlieb
New York Univ., New York, NY

Issue’s Table of Contents

ISCA '92: Proceedings of the 19th annual international symposium on Computer architecture
May 1992
439 pages
ISBN:0897915097
DOI:10.1145/139669
Chairman:
Allan Gottlieb
New York Unvi., New York, NY

Copyright © 1992 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 1992

Published in SIGARCH Volume 20, Issue 2

Check for updates

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

232
Total Citations
View Citations
1,823
Total Downloads

Downloads (Last 12 months)155
Downloads (Last 6 weeks)9

Reflects downloads up to 06 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Edwards JVishkin U(2021)Study of Fine-grained Nested Parallelism in CDCL SAT SolversACM Transactions on Parallel Computing10.1145/34706398:3(1-18)Online publication date: 20-Sep-2021
https://dl.acm.org/doi/10.1145/3470639
Shao YBrooks D(2015)Research Infrastructures for Hardware AcceleratorsSynthesis Lectures on Computer Architecture10.2200/S00677ED1V01Y201511CAC03410:4(1-99)Online publication date: 18-Nov-2015
https://doi.org/10.2200/S00677ED1V01Y201511CAC034
Goossens BParello DPorada KRahmoune D(2015)Toward a Core Design to Distribute an Execution on a Manycore ProcessorProceedings of the 13th International Conference on Parallel Computing Technologies - Volume 925110.1007/978-3-319-21909-7_38(390-404)Online publication date: 31-Aug-2015
https://dl.acm.org/doi/10.1007/978-3-319-21909-7_38
Dévai RJász JNagy CFerenc R(2014)Designing and implementing control flow graph for magic 4th generation languageActa Cybernetica10.14232/actacyb.21.3.2014.921:3(419-437)Online publication date: 1-Aug-2014
https://dl.acm.org/doi/10.14232/actacyb.21.3.2014.9
Goossens BParello D(2013)Limits of Instruction-Level Parallelism CaptureProcedia Computer Science10.1016/j.procs.2013.05.33418(1664-1673)Online publication date: 2013
https://doi.org/10.1016/j.procs.2013.05.334
Anjo ICachopo J(2013)A Software-Based Method-Level Speculation Framework for the Java PlatformLanguages and Compilers for Parallel Computing10.1007/978-3-642-37658-0_14(205-219)Online publication date: 2013
https://doi.org/10.1007/978-3-642-37658-0_14
Anjo ICachopo J(2013)Improving Continuation-Powered Method-Level Speculation for JVM ApplicationsAlgorithms and Architectures for Parallel Processing10.1007/978-3-319-03859-9_12(153-165)Online publication date: 18-Dec-2013
https://dl.acm.org/doi/10.1007/978-3-319-03859-9_12
Cabezas VStanley-Marbell PCascaval CTrancoso PPrasanna V(2011)Quantitative analysis of parallelism and data movement properties across the Berkeley computational motifsProceedings of the 8th ACM International Conference on Computing Frontiers10.1145/2016604.2016625(1-2)Online publication date: 3-May-2011
https://dl.acm.org/doi/10.1145/2016604.2016625
Caparrós Cabezas VStanley-Marbell PMeyer auf der Heide FRajaraman R(2011)Parallelism and data movement characterization of contemporary application classesProceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures10.1145/1989493.1989506(95-104)Online publication date: 4-Jun-2011
https://dl.acm.org/doi/10.1145/1989493.1989506
Morano DUht AKaeli D(2010)Resource Flow MicroarchitecturesSpeculative Execution in High Performance Computer Architectures10.1201/9781420035155.ch15(393-419)Online publication date: 14-Jan-2010
https://doi.org/10.1201/9781420035155.ch15
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Issue’s Table of Contents