DOI: 10.1145/782814.782842
Article

A GSA-based compiler infrastructure to extract parallelism from complex loops

Published: 23 June 2003

Abstract

This paper presents a new approach for the detection of coarse-grain parallelism in loop nests that contain complex computations, including subscripted subscripts as well as conditional statements that introduce complex control flows at run-time. The approach is based on the recognition of the computational kernels calculated in a loop without considering the semantics of the code. The detection is carried out on top of the Gated Single Assignment (GSA) program representation at two different levels. First, the use-def chains between the statements that compose the strongly connected components (SCCs) of the GSA use-def chain graph are analyzed (intra-SCC analysis). As a result, the kernel computed in each SCC is recognized. Second, the use-def chains between statements of different SCCs are examined (inter-SCC analysis). This second abstraction level enables the compiler to detect more complex computational kernels. A prototype was implemented using the infrastructure provided by the Polaris compiler. Experimental results are presented that show the effectiveness of our approach for the detection of coarse-grain parallelism in a suite of real codes.
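The following is a minimal, illustrative Python sketch of the two-level analysis summarized above. It is not the paper's implementation: the statement encoding, the classification rules, and the helper names (statements, use_def, tarjan_sccs, classify_scc) are hypothetical simplifications chosen only to make the idea concrete. The sketch models a small GSA-like loop body as a use-def graph, computes its strongly connected components, classifies the kernel computed inside each SCC (intra-SCC analysis), and then inspects use-def edges between SCCs to recognize a reduction whose subscript is produced by another component as an irregular, subscripted reduction (inter-SCC analysis).

```python
# Minimal sketch (assumed encoding, not the paper's implementation). A loop such as
#   DO i = 1, n
#     j = col(i)                          ! subscripted subscript
#     IF (mask(i)) a(j) = a(j) + x(i)     ! conditionally guarded update
#   END DO
# is modelled as three GSA-like statements, each with an operation kind and the
# set of variables whose definitions it uses.
statements = {
    "i": ("induction", {"i"}),                   # mu-function: i depends on its previous value
    "j": ("subscript", {"i"}),                   # j = col(i)
    "a": ("reduction-update", {"a", "j", "i"}),  # gamma-guarded a(j) = a(j) + x(i)
}

# Use-def edges: statement s -> statement t when s uses a value defined by t.
use_def = {s: {v for v in uses if v in statements} for s, (_, uses) in statements.items()}

def tarjan_sccs(graph):
    """Classic recursive Tarjan algorithm; returns the strongly connected components."""
    index, low, on_stack, stack, sccs, counter = {}, {}, set(), [], [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs

def classify_scc(scc):
    """Intra-SCC analysis: name the kernel computed by the statements of one SCC."""
    kinds = {statements[s][0] for s in scc}
    if kinds == {"induction"}:
        return "induction variable"
    if kinds == {"subscript"}:
        return "subscripted-subscript computation"
    if "reduction-update" in kinds:
        return "reduction"
    return "other"

# Inter-SCC analysis: a reduction fed by a component that computes a subscripted
# subscript is promoted to an irregular (subscripted) reduction.
for scc in tarjan_sccs(use_def):
    kind = classify_scc(scc)
    if kind == "reduction":
        feeders = {t for s in scc for t in use_def[s]} - scc
        if any(statements[f][0] == "subscript" for f in feeders):
            kind = "irregular (subscripted) reduction"
    print(sorted(scc), "->", kind)
```

Under these assumptions, running the sketch labels the i component as an induction variable, the j component as a subscripted-subscript computation, and, after the inter-SCC step, the a component as an irregular reduction, the kind of kernel that can then be parallelized with run-time support.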

    Information & Contributors

    Information

    Published In

    ICS '03: Proceedings of the 17th annual international conference on Supercomputing
    June 2003
    380 pages
    ISBN: 1581137338
    DOI: 10.1145/782814
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 June 2003

    Author Tags

    1. GSA
    2. loop-level kernel recognition
    3. parallelizing compilers
    4. strongly connected components

    Qualifiers

    • Article

    Conference

    ICS03: International Conference on Supercomputing 2003
    June 23 - 26, 2003
    San Francisco, CA, USA

    Acceptance Rates

    ICS '03 Paper Acceptance Rate: 36 of 171 submissions (21%)
    Overall Acceptance Rate: 629 of 2,180 submissions (29%)

    Cited By

    • (2015) Experiences in extending Parallware to support OpenACC. Proceedings of the Second Workshop on Accelerator Programming using Directives, pages 1-12. DOI: 10.1145/2832105.2832112. Online publication date: 15-Nov-2015.
    • (2012) HERCULES. Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum, pages 574-583. DOI: 10.1109/IPDPSW.2012.69. Online publication date: 21-May-2012.
    • (2011) Data locality and parallelism optimization using a constraint-based approach. Journal of Parallel and Distributed Computing, 71(2):280-287. DOI: 10.1016/j.jpdc.2010.08.005. Online publication date: 1-Feb-2011.
    • (2010) Code scheduling for optimizing parallelism and data locality. Proceedings of the 16th International Euro-Par Conference on Parallel Processing: Part I, pages 204-216. DOI: 10.5555/1887695.1887718. Online publication date: 31-Aug-2010.
    • (2010) Code Scheduling for Optimizing Parallelism and Data Locality. Euro-Par 2010 - Parallel Processing, pages 204-216. DOI: 10.1007/978-3-642-15277-1_20. Online publication date: 2010.
    • (2008) XARK. ACM Transactions on Programming Languages and Systems, 30(6):1-56. DOI: 10.1145/1391956.1391959. Online publication date: 30-Oct-2008.
    • (2008) Efficiently Building the Gated Single Assignment Form in Codes with Pointers in Modern Optimizing Compilers. Proceedings of the 14th International Euro-Par Conference on Parallel Processing, pages 360-369. DOI: 10.1007/978-3-540-85451-7_39. Online publication date: 26-Aug-2008.
    • (2007) Program behavior characterization through advanced kernel recognition. Proceedings of the 13th International Euro-Par Conference on Parallel Processing, pages 237-247. DOI: 10.5555/2391541.2391572. Online publication date: 28-Aug-2007.
    • (2007) Precise automatable analytical modeling of the cache behavior of codes with indirections. ACM Transactions on Architecture and Code Optimization, 4(3), article 16. DOI: 10.1145/1275937.1275940. Online publication date: 1-Sep-2007.
    • (2007) Program Behavior Characterization Through Advanced Kernel Recognition. Euro-Par 2007 Parallel Processing, pages 237-247. DOI: 10.1007/978-3-540-74466-5_27. Online publication date: 2007.
