[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/224170.224337acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
Article
Free access

Detecting coarse-grain parallelism using an interprocedural parallelizing compiler

Published: 08 December 1995 Publication History

Abstract

This paper presents an extensive empirical evaluation of an interprocedural parallelizing compiler, developed as part of the Stanford SUIF compiler system. The system incorporates a comprehensive and integrated collection of analyses, including privatization and reduction recognition for both array and scalar variables, and symbolic analysis of array subscripts. The interprocedural analysis framework is designed to provide analysis results nearly as precise as full inlining but without its associated costs. Experimentation with this system shows that it is capable of detecting coarser granularity of parallelism than previously possible. Specifically, it can parallelize loops that span numerous procedures and hundreds of lines of codes, frequently requiring modifications to array data structures such as privatization and reduction transformations. Measurements from several standard benchmark suites demonstrate that an integrated combination of interprocedural analyses can substantially advance the capability of automatic parallelization technology.

References

[1]
J. M. Anderson, S. P. Amarasinghe, and M. S. Lam. Data and computation transformations for multiprocessors. In Proceedings of the Fifth ACM SIG- PLAN Symposium on Principles and Practice of Parallel Programming, July 1995.]]
[2]
B. Blume, R. Eigenmann, K. Faigin, J. Grout, Jay Hoe inger, D. Padua, P. Petersen, B. Pottenger, L. Rauchwerger, P. Tu, and S. Weatherford. Polaris: The next generation in parallelizing compilers. In Proceedings of the Seventh Annual Workshop on Languages and Compilers for Parallel Computing, August 1994.]]
[3]
W. Blume and R. Eigenmann. Performance analysis of parallelizing compilers on the Perfect Benchmarks programs. IEEE Transactions on Parallel and Distributed Systems, 3(6):643-656, November 1992.]]
[4]
W. Blume and R. Eigenmann. The range test: A dependence test for symbolic, non-linear expressions. In Proceedings of Supercomputing '94. IEEE Press, November 1994.]]
[5]
K. Cooper, M.W. Hall, and K. Kennedy. A methodology for procedure cloning. Computer Languages, 19(2), April 1993.]]
[6]
B. Creusillet and F. Irigoin. Interprocedural array region analyses. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag, August 1995.]]
[7]
G. Goff, K. Kennedy, and C.-W. Tseng. Practical dependence testing. In Proceedings of the SIGPLAN '91 Conference onProgramming Language Design and Implementation, Toronto, Canada, June 1991.]]
[8]
M. Haghighat and C. Polychronopoulos. Symbolic analysis: A basis for parallelization, optimization, and scheduling of programs. In Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing, Portland, OR, August 1993.]]
[9]
M. Hall, B. Murphy, S. Amarasinghe, S. Liao, and M. Lam. Interprocedural analysis for parallelization. In Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag, August 1995.]]
[10]
M. W. Hall, S. Amarasinghe, and B. Murphy. Interprocedural analysis for parallelization: Design and experience. In Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, pages 650-655, San Francisco, CA, February 1995.]]
[11]
M. W. Hall, J. Mellor-Crummey, A. Carle, and R. Rodriguez. FIAT: A framework for interprocedural analysis and transformation. In Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing, Portland, OR, August 1993.]]
[12]
P. Havlak. Interprocedural symbolic analysis. PhD thesis, Rice University, Dept. of Computer Science, May 1994.]]
[13]
P. Havlak and K. Kennedy. An implementation of interprocedural bounded regular section analysis. IEEE Transactions on Parallel and Distributed Systems, 2(3):350-360, July 1991.]]
[14]
M. Hind, M. Burke, P. Carini, and S. Midkiff. An empirical study of precise interprocedural array analysis. Scientific Programming, 3(3):255-271, 1994.]]
[15]
F. Irigoin. Interprocedural analyses for programming environments. In NSF- CNRS Workshop on Evironments and Tools for Parallel Scientific Programming, September 1992.]]
[16]
F. Irigoin, P. Jouvelot, and R. Triolet. Semantical interprocedural parallelization: An overview of the PIPS project. In Proceedings of the 1991 ACM International Conference on Supercomputing, Cologne, Germany, June 1991.]]
[17]
W. Landi and B.G. Ryder. A safe approximate algorithm for interprocedural pointer aliasing. In SIGPLAN '92 Conference on Programming Language Design and Implementation, SIGPLAN Notices 27(7), pages 235-248, July 1992.]]
[18]
Z. Li and P. Yew. Efficient interprocedural analysis for program restructuring for parallel programs. In Proceedings of the ACM SIGPLAN Symposium on Parallel Programming: Experience with Applications, Languages, and Systems (PPEALS), New Haven, CT, July 1988.]]
[19]
V. Maslov. Delinearization: An efficient way to break multiloop dependence equations. In Proceedings of the ACM SIGPLAN '92 Conference on Programming Language Design and Implementation, June 1992.]]
[20]
D. E. Maydan, J. L. Hennessy, and M. S. Lam. Efficient and exact data dependence analysis. In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Canada, June 1991.]]
[21]
R. Metzger and P. Smith. The CONVEX application compiler. Fortran Journal, 3(1):8-10, 1991.]]
[22]
E. Myers. A precise inter-procedural data ow algorithm. In Conference Record of the Eighth Annual Symposium on Principles of Programming Languages. ACM, January 1981.]]
[23]
J. P. Singh and J. L. Hennessy. An empirical investigation of the effectiveness of and limitations of automatic parallelization. In Proceedings of the International Symposium on Shared Memory Multiprocessors, Tokyo, Japan, April 1991.]]
[24]
R. Triolet, F. Irigoin, and P. Feautrier. Direct parallelization of call statements. In Proceedings of the SIGPLAN '86 Symposium on Compiler Construction, SIGPLAN Notices 21(7), pages 176-185. ACM, July 1986.]]
[25]
C-W. Tseng. Compiler optimizations for eliminating barrier synchronization. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, July 1995.]]
[26]
P. Tu and D. Padua. Automatic array privatization. In Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing, Portland, OR, August 1993.]]
[27]
M. E. Wolf. Improving Locality and Parallelism in Nested Loops. PhD thesis, Dept. of Computer Science, Stanford University, August 1992.]]

Cited By

View all
  • (2022)VICOProceedings of the 36th ACM International Conference on Supercomputing10.1145/3524059.3532393(1-14)Online publication date: 28-Jun-2022
  • (2022)Fine-Granular Computation and Data Layout Reorganization for Improving LocalityProceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design10.1145/3508352.3549386(1-9)Online publication date: 30-Oct-2022
  • (2021)Mix and Match: Reorganizing Tasks for Enhancing Data LocalityProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/34600875:2(1-24)Online publication date: 4-Jun-2021
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
Supercomputing '95: Proceedings of the 1995 ACM/IEEE conference on Supercomputing
December 1995
875 pages
ISBN:0897918169
DOI:10.1145/224170
  • Chairman:
  • Sid Karin
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 December 1995

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. compiler optimizations
  2. interprocedural data-flow analysis
  3. parallelizing compilers
  4. shared memory multiprocessors

Qualifiers

  • Article

Conference

SC '95
Sponsor:

Acceptance Rates

Supercomputing '95 Paper Acceptance Rate 69 of 241 submissions, 29%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)49
  • Downloads (Last 6 weeks)15
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2022)VICOProceedings of the 36th ACM International Conference on Supercomputing10.1145/3524059.3532393(1-14)Online publication date: 28-Jun-2022
  • (2022)Fine-Granular Computation and Data Layout Reorganization for Improving LocalityProceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design10.1145/3508352.3549386(1-9)Online publication date: 30-Oct-2022
  • (2021)Mix and Match: Reorganizing Tasks for Enhancing Data LocalityProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/34600875:2(1-24)Online publication date: 4-Jun-2021
  • (2021)Distance-in-time versus distance-in-spaceProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454069(665-680)Online publication date: 19-Jun-2021
  • (2021)Compiler support for near data computingProceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3437801.3441600(90-104)Online publication date: 17-Feb-2021
  • (2019)The trade-offs between write and readCommunications of the ACM10.1145/335933462:11(111-113)Online publication date: 24-Oct-2019
  • (2019)The positive and negative effects of social media in IndiaCommunications of the ACM10.1145/334567162:11(98-99)Online publication date: 24-Oct-2019
  • (2019)The internet of the oralsCommunications of the ACM10.1145/334345262:11(100-103)Online publication date: 24-Oct-2019
  • (2019)Computing research at Tata Consultancy ServicesCommunications of the ACM10.1145/334344362:11(62-63)Online publication date: 24-Oct-2019
  • (2019)The five-minute rule 30 years later and its impact on the storage hierarchyCommunications of the ACM10.1145/331816362:11(114-120)Online publication date: 24-Oct-2019
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media