Relaxing SIMD control flow constraints using loop transformations

Published: 01 July 1992

Abstract

Many loop nests in scientific codes contain a parallelizable outer loop but have an inner loop whose number of iterations varies between iterations of the outer loop. When this kind of loop nest runs on a SIMD machine, the SIMD-inherent restriction to a single program counter common to all processors degrades performance relative to comparable MIMD implementations. This problem is not due to limited parallelism or poor load balance; it is purely a problem of control flow.
This paper presents a loop transformation, which we call loop flattening, that overcomes this limitation by letting each processor advance to its next loop iteration containing useful computation, if such an iteration exists for that processor. We study a concrete example derived from a molecular dynamics code and compare performance results for flattened and unflattened versions of this kernel on two SIMD machines, the CM-2 and the DECmpp 12000. We then evaluate loop flattening from the compiler's perspective in terms of applicability, cost, profitability, and safety. We conclude by arguing that loop flattening, whether performed by the programmer or by the compiler, introduces negligible overhead and can significantly improve the performance of scientific codes for solving irregular problems.
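The control-flow effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch in Python, not the paper's own code: it models each SIMD processor as a "lane" with its own list of inner-loop trip counts and counts how many lock-step steps each schedule needs. The names `run_unflattened`, `run_flattened`, and `DEMO_COUNTS` are hypothetical, and the sketch assumes every lane owns the same number of outer iterations.

```python
def run_unflattened(trip_counts, body):
    """Lock-step nest: all lanes share the outer index i and inner index j.

    trip_counts[p][i] is the inner trip count of lane p in outer iteration i.
    Returns the number of lock-step steps (idle lanes still cost a step).
    """
    lanes = len(trip_counts)
    outer = len(trip_counts[0])  # assumption: same outer count on every lane
    steps = 0
    for i in range(outer):
        # The single program counter forces every lane to wait for the
        # longest inner loop of this outer iteration.
        widest = max(lane[i] for lane in trip_counts)
        for j in range(widest):
            for p in range(lanes):
                if j < trip_counts[p][i]:  # lanes past their bound are masked
                    body(p, i, j)
            steps += 1
    return steps


def run_flattened(trip_counts, body):
    """Flattened nest: each lane keeps private (i, j) and advances on its own."""
    lanes = len(trip_counts)
    outer = len(trip_counts[0])
    i = [0] * lanes
    j = [0] * lanes

    def skip_empty(p):
        # Move lane p to its next iteration that contains useful work.
        while i[p] < outer and j[p] >= trip_counts[p][i[p]]:
            i[p] += 1
            j[p] = 0

    for p in range(lanes):
        skip_empty(p)
    steps = 0
    while any(i[p] < outer for p in range(lanes)):
        for p in range(lanes):
            if i[p] < outer:  # lanes that have finished all work are masked
                body(p, i[p], j[p])
                j[p] += 1
                skip_empty(p)
        steps += 1
    return steps


# Hypothetical workload: every lane does 6 inner iterations in total,
# but the long inner loop falls in a different outer iteration on each lane.
DEMO_COUNTS = [[4, 1, 1], [1, 4, 1], [1, 1, 4]]
```

In this model the unflattened schedule costs the sum over outer iterations of the largest trip count, while the flattened schedule costs the largest per-lane total. For `DEMO_COUNTS` that is 12 steps versus 6, with the same set of `(p, i, j)` body invocations in both cases.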


Published In

ACM SIGPLAN Notices, Volume 27, Issue 7
July 1992
352 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/143103
  • PLDI '92: Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
    July 1992
    352 pages
    ISBN: 0897914759
    DOI: 10.1145/143095
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Cited By

  • (2017) Combining loop unrolling strategies and code predication to reduce the worst-case execution time of real-time software. Applied Computing and Informatics 13:2 (184-193). DOI: 10.1016/j.aci.2017.03.002. Online publication date: Jul-2017
  • (1994) Emulating MIMD Behavior on SIMD Machines. Massively Parallel Processing Applications and Development (313-320). DOI: 10.1016/B978-0-444-81784-6.50042-7. Online publication date: 1994
  • (1994) Project Triton: Towards Improved Programmability of Parallel Computers. The Interaction of Compilation Technology and Computer Architecture (249-281). DOI: 10.1007/978-1-4615-2684-1_10. Online publication date: 1994
  • (2024) MIMD Programs Execution Support on SIMD Machines: A Holistic Survey. IEEE Access 12 (34354-34377). DOI: 10.1109/ACCESS.2024.3372990. Online publication date: 2024
  • (2022) Loop Rolling for Code Size Reduction. 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (217-229). DOI: 10.1109/CGO53902.2022.9741256. Online publication date: 2-Apr-2022
  • (2015) Single-Instruction Multiple-Data Execution. Synthesis Lectures on Computer Architecture 10:1 (1-121). DOI: 10.2200/S00647ED1V01Y201505CAC032. Online publication date: 27-May-2015
  • (2015) Exploring and Evaluating Array Layout Restructuring for SIMDization. Languages and Compilers for Parallel Computing (351-366). DOI: 10.1007/978-3-319-17473-0_23. Online publication date: 1-May-2015
  • (2014) A Portable Optimization Engine for Accelerating Irregular Data-Traversal Applications on SIMD Architectures. ACM Transactions on Architecture and Code Optimization 11:2 (1-31). DOI: 10.1145/2632215. Online publication date: 1-Jun-2014
  • (2013) Breaking SIMD shackles with an exposed flexible microarchitecture and the access execute PDG. Proceedings of the 22nd international conference on Parallel architectures and compilation techniques (341-352). DOI: 10.5555/2523721.2523767. Online publication date: 7-Oct-2013
  • (2013) SIMD parallelization of applications that traverse irregular data structures. Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (1-10). DOI: 10.1109/CGO.2013.6494989. Online publication date: 23-Feb-2013
