[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article
Free access

Limits of control flow on parallelism

Published: 01 April 1992 Publication History

Abstract

This paper discusses three techniques useful in relaxing the constraints imposed by control flow on parallelism: control dependence analysis, executing multiple flows of control simultaneously, and speculative execution. We evaluate these techniques by using trace simulations to find the limits of parallelism for machines that employ different combinations of these techniques. We have three major results. First, local regions of code have limited parallelism, and control dependence analysis is useful in extracting global parallelism from different parts of a program. Second, a superscalar processor is fundamentally limited because it cannot execute independent regions of code concurrently. Higher performance can be obtained with machines, such as multiprocessors and dataflow machines, that can simultaneously follow multiple flows of control. Finally, without speculative execution to allow instructions to execute before their control dependences are resolved, only modest amounts of parallelism can be obtained for programs with complex control flow.

References

[1]
D. Bernstein and M. Rodeh. Global Instruction Scheduling for Superscalar Machines. In Proceedings of the ACM SIGPLAN'91 Conference on Programming Language Design and Implementation, pages 241-255, June 1991.
[2]
R. P. Colwell, R. P. Nix, j. J. O'Donnell, D. B. Papworth, and P. K. Rodman. A VLIW Architecture for a Trace Scheduling Compiler. IEEE Transactions on Computers, C-37(8):967-979, August 1988. Also in ASPLOS-3 pp. 180-192.
[3]
R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. An Efficient Method of Computing Static Single Assignment Form. In Proceedings of the 16th Annual ACM Symposium on Principles of Programming Languages, pages 25-35, January 1989.
[4]
J. A. Fisher. Trace Scheduling: A Technique for Global Microcode Compaction. IEEE Transactions on Computers, C-30(7):478-490, July 1981.
[5]
P. Y. Hsu. Highly Concurrent Scalar Processing. PhD thesis, University of Illinois at Urbana-Champaign, 1986.
[6]
W. W. Hwu, T. M. Conte, and P. P. Chang. Comparing Software and Hardware Schemes For Reducing the Cost of Branches. In Proceedings of the 16th Annual International Symposium on Computer Architecture, pages 224-233, May 1989.
[7]
M. Johnson. Superscalar Microprocessor Design. Prentice Hall, Englewood Cliffs, NJ, 1990.
[8]
S. McFarling and J. Hennessy. Reducing the Cost of Branches. In Proceedings of the 13th Annual International Symposium on Computer Architecture, pages 396--404, June 1986.
[9]
K. Murakami, N. lrie, M. Kuga, and S. Tomitao SIMP (Single Instruction Stream/Multiple Instruction Pipelining): A Novel High-Speed Single-Processor Architecture. in Proceedings of the 16th Annual International Symposium on Computer Architecture, pages 78-85, May 1989.
[10]
A. Nicolau and J. A. Fisher. Measuring the Parallelism Available for Very Long Instruction Word Architectures. IEEE Transactions on Computers, C- 33(11):968-976, November 1984.
[11]
Y. N. Patt, S. W. Melvin, W. W. Hwu, and M. Shebanow. Critical Issues Regarding HPS, A High Performance Microarchitecture, In Proceedings of the 18th Annual Workshop on Microprogramming, pages 109- 116, December 1985.
[12]
E. M. Riseman and C. C. Foster. The Inhibition of Potential Parallelism by Conditional Jumps. IEEE Transactions on Computers, C-21(12):1405-1411, December 1972.
[13]
J. E. Smith and A. R. Pleszkun. Implementation of Precise Interrupts in Pipelined Processors. In Proceedings of the 12th Annual International Symposium on Computer Architecture, pages 36-44, June 1985.
[14]
M. D. Smith, M. johnson, and M. A. Horowitz. Limits on Multiple Instruction Issue. In Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems, pages 290-302, April 1989.
[15]
M. D. Smith, M. S. Lam, and M. A. Horowitz. Boosting Beyond Static Scheduling in a Superscalar Processor. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 344-354, May 1990.
[16]
G. S. Sohi and S. Vajapeyam. Instruction Issue Logic for High-Performance, Interruptible Pipelined Processors. In Proceedings of the 14th Annual international Symposium on Computer Architecture, pages 27-34, June 1987.
[17]
D. W. Wall. Limits of Instruction-Level Parallelism. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 176-188, April 1991.

Cited By

View all
  • (2021)Study of Fine-grained Nested Parallelism in CDCL SAT SolversACM Transactions on Parallel Computing10.1145/34706398:3(1-18)Online publication date: 20-Sep-2021
  • (2015)Research Infrastructures for Hardware AcceleratorsSynthesis Lectures on Computer Architecture10.2200/S00677ED1V01Y201511CAC03410:4(1-99)Online publication date: 18-Nov-2015
  • (2015)Toward a Core Design to Distribute an Execution on a Manycore ProcessorProceedings of the 13th International Conference on Parallel Computing Technologies - Volume 925110.1007/978-3-319-21909-7_38(390-404)Online publication date: 31-Aug-2015
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGARCH Computer Architecture News
ACM SIGARCH Computer Architecture News  Volume 20, Issue 2
Special Issue: Proceedings of the 19th annual international symposium on Computer architecture (ISCA '92)
May 1992
429 pages
ISSN:0163-5964
DOI:10.1145/146628
Issue’s Table of Contents
  • cover image ACM Conferences
    ISCA '92: Proceedings of the 19th annual international symposium on Computer architecture
    May 1992
    439 pages
    ISBN:0897915097
    DOI:10.1145/139669

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 1992
Published in SIGARCH Volume 20, Issue 2

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)155
  • Downloads (Last 6 weeks)9
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2021)Study of Fine-grained Nested Parallelism in CDCL SAT SolversACM Transactions on Parallel Computing10.1145/34706398:3(1-18)Online publication date: 20-Sep-2021
  • (2015)Research Infrastructures for Hardware AcceleratorsSynthesis Lectures on Computer Architecture10.2200/S00677ED1V01Y201511CAC03410:4(1-99)Online publication date: 18-Nov-2015
  • (2015)Toward a Core Design to Distribute an Execution on a Manycore ProcessorProceedings of the 13th International Conference on Parallel Computing Technologies - Volume 925110.1007/978-3-319-21909-7_38(390-404)Online publication date: 31-Aug-2015
  • (2014)Designing and implementing control flow graph for magic 4th generation languageActa Cybernetica10.14232/actacyb.21.3.2014.921:3(419-437)Online publication date: 1-Aug-2014
  • (2013)Limits of Instruction-Level Parallelism CaptureProcedia Computer Science10.1016/j.procs.2013.05.33418(1664-1673)Online publication date: 2013
  • (2013)A Software-Based Method-Level Speculation Framework for the Java PlatformLanguages and Compilers for Parallel Computing10.1007/978-3-642-37658-0_14(205-219)Online publication date: 2013
  • (2013)Improving Continuation-Powered Method-Level Speculation for JVM ApplicationsAlgorithms and Architectures for Parallel Processing10.1007/978-3-319-03859-9_12(153-165)Online publication date: 18-Dec-2013
  • (2011)Quantitative analysis of parallelism and data movement properties across the Berkeley computational motifsProceedings of the 8th ACM International Conference on Computing Frontiers10.1145/2016604.2016625(1-2)Online publication date: 3-May-2011
  • (2011)Parallelism and data movement characterization of contemporary application classesProceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures10.1145/1989493.1989506(95-104)Online publication date: 4-Jun-2011
  • (2010)Resource Flow MicroarchitecturesSpeculative Execution in High Performance Computer Architectures10.1201/9781420035155.ch15(393-419)Online publication date: 14-Jan-2010
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media