article
Open access

The design, implementation, and evaluation of Jade

Published: 01 May 1998

Abstract

Jade is a portable, implicitly parallel language designed for exploiting task-level concurrency. Jade programmers start with a program written in a standard serial, imperative language, then use Jade constructs to declare how parts of the program access data. The Jade implementation uses this data access information to automatically extract the concurrency and map the application onto the machine at hand. The resulting parallel execution preserves the semantics of the original serial program. We have implemented Jade as an extension to C, and Jade implementations exist for shared-memory multiprocessors, homogeneous message-passing machines, and heterogeneous networks of workstations. In this article we discuss the design goals and decisions that determined the final form of Jade and present an overview of the Jade implementation. We also present our experience using Jade to implement several complete scientific and engineering applications. We use this experience to evaluate how the different Jade language features were used in practice and how well Jade as a whole supports the process of developing parallel applications. We find that the basic idea of preserving the serial semantics simplifies the program development process, and that the concept of using data access specifications to guide the parallelization offers significant advantages over more traditional control-based approaches. We also find that the Jade data model can interact poorly with concurrency patterns that write disjoint pieces of a single aggregate data structure, although this problem arises in only one of the applications.

References

[1]
AMERICA, P. 1987. POOL-T: A parallel object-oriented language. In Object Oriented Concurrent Programming, A. Yonezawa and M. Tokoro, Eds. MIT Press, Cambridge, Mass., 199-220.
[2]
AMZA, C., COX, A., DWARKADAS, S., KELEHER, P., LU, H., RAJAMONY, R., YU, W., AND ZWAENEPOEL, W. 1996. TreadMarks: Shared memory computing on networks of workstations. IEEE Comput. 29, 2 (Feb.), 18-28.
[3]
APPEL, A. AND LI, K. 1991. Virtual memory primitives for user programs. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York.
[4]
ARPACI, R., CULLER, D., KRISHNAMURTHY, A., STEINBERG, S., AND YELICK, K. 1995. Empirical evaluation of the CRAY-T3D: A compiler perspective. In Proceedings of the 22nd International Symposium on Computer Architecture. ACM, New York.
[5]
ARVIND AND THOMAS, R. 1981. I-structures: An efficient data type for functional languages. Tech. Rep. MIT/LCS/TM-210, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass.
[6]
BAL, H., KAASHOEK, M., AND TANENBAUM, A. 1992. Orca: A language for parallel programming of distributed systems. IEEE Trans. Softw. Eng. 18, 3 (Mar.).
[7]
BASKETT, F., JERMOLUK, T., AND SOLOMON, D. 1988. The 4D-MP graphics superworkstation: Computing + graphics = 40 mips + 40 mflops + 100,000 lighted polygons per second. In Proceedings of COMPCON Spring 88. 468-471.
[8]
BENNETT, J., CARTER, J., AND ZWAENEPOEL, W. 1990. Munin: Distributed shared memory based on type-specific memory coherence. In Proceedings of the 2nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York.
[9]
BERRENDORF, R. AND HELIN, J. 1992. Evaluating the basic performance of the Intel iPSC/860 parallel computer. Concur. Pract. Exper. 4, 3 (May), 223-240.
[10]
BERSHAD, B., ZEKAUSKAS, M., AND SAWDON, W. 1993. The Midway distributed shared memory system. In Proceedings of COMPCON '93. 528-537.
[11]
BLELLOCH, G., CHATTERJEE, S., HARDWICK, J., SIPELSTEIN, J., AND ZAGHA, M. 1993. Implementation of a portable nested data-parallel language. In Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York.
[12]
BLUMOFE, R., JOERG, C., KUSZMAUL, B., LEISERSON, C., RANDALL, K., AND ZHOU, Y. 1995. Cilk: An efficient multithreaded runtime system. In Proceedings of the 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York.
[13]
BROWNING, R., LI, T., CHUI, B., YE, J., PEASE, R., CZYZEWSKI, Z., AND JOY, D. 1994. Empirical forms for the electron/atom elastic scattering cross sections from 0.1-30 keV. J. Appl. Phys. 76, 4 (Aug.), 2016-2022.
[14]
BROWNING, R., LI, T., CHUI, B., YE, J., PEASE, R., CZYZEWSKI, Z., AND JOY, D. 1995. Low-energy electron/atom elastic scattering cross sections for 0.1-30 keV. Scanning 17, 4 (July/August), 250-253.
[15]
BURNS, A. 1988. Programming in Occam 2. Addison-Wesley, Reading, Mass.
[16]
CARRIERO, N. AND GELERNTER, D. 1989. Linda in context. Commun. ACM 32, 4 (Apr.), 444-458.
[17]
CHANDRA, R., GUPTA, A., AND HENNESSY, J. 1993. Data locality and load balancing in COOL. In Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York.
[18]
DAYDE, M. AND DUFF, I. 1990. Use of parallel Level 3 BLAS in LU factorization on three vector multiprocessors; the Alliant FX/80, the Cray-2, and the IBM 3090 VF. In Proceedings of the 1990 ACM International Conference on Supercomputing. ACM, New York.
[19]
DIETZ, H. AND KLAPPHOLZ, D. 1986. Refined Fortran: Another sequential language for parallel programming. In Proceedings of the 1986 International Conference on Parallel Processing, K. Hwang, S. M. Jacobs, and E. E. Swartzlander, Eds. 184-189.
[20]
DONGARRA, J. AND SORENSEN, D. 1987. SCHEDULE: Tools for developing and analyzing parallel Fortran programs. In The Characteristics of Parallel Algorithms, D. Gannon, L. Jamieson, and R. Douglass, Eds. The MIT Press, Cambridge, Mass.
[21]
FEO, J., CANN, D., AND OLDEHOEFT, R. 1990. A report on the Sisal language project. J. Parallel Distrib. Comput. 10, 4 (Dec.), 349-366.
[22]
FOSTER, I. AND TAYLOR, S. 1990. Strand: New Concepts in Parallel Programming. Prentice-Hall, Englewood Cliffs, N.J.
[23]
FU, C. AND YANG, T. 1997. Space and time efficient execution of parallel irregular computations. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York.
[24]
GELERNTER, D. 1985. Generative communication in Linda. ACM Trans. Program. Lang. Syst. 7, 1 (Jan.), 80-112.
[25]
GHARACHORLOO, K. 1996. Memory consistency models for shared memory multiprocessors. Ph.D. thesis, Dept. of Electrical Engineering, Stanford Univ., Stanford, Calif.
[26]
GIFFORD, D., JOUVELOT, P., LUCASSEN, J., AND SHELDON, M. 1987. FX-87 reference manual. Tech. Rep. MIT/LCS/TR-407, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass. Sept.
[27]
GOLUB, G. AND VAN LOAN, C. 1989. Matrix Computations, 2nd ed. The Johns Hopkins Univ. Press, Baltimore, Md.
[28]
GREGORY, S. 1987. Parallel Logic Programming in PARLOG: The Language and Its Implementation. Addison-Wesley, Reading, Mass.
[29]
GROSS, T., O'HALLARON, D., AND SUBHLOK, J. 1994. Task parallelism in a High Performance Fortran framework. IEEE Parallel Distrib. Tech. 2, 3 (Fall), 16-26.
[30]
HAGERSTEN, E., LANDIN, A., AND HARIDI, S. 1992. DDM--A cache-only memory architecture. Computer 25, 9 (Sept.), 44-54.
[31]
HALSTEAD, R., JR. 1985. Multilisp: A language for concurrent symbolic computation. ACM Trans. Program. Lang. Syst. 7, 4 (Oct.), 501-538.
[32]
HAMMEL, R. AND GIFFORD, D. 1988. FX-87 performance measurements: Dataflow implementation. Tech. Rep. MIT/LCS/TR-421, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass. Nov.
[33]
HARRIS, J., LAZARATOS, S., AND MICHELENA, R. 1990. Tomographic string inversion. In Proceedings of the 60th Annual International Meeting, Society of Exploration and Geophysics, Extended Abstracts. 82-85.
[34]
HENDREN, L., HUMMEL, J., AND NICOLAU, A. 1992. Abstractions for recursive pointer data structures: Improving the analysis and transformation of imperative programs. In Proceedings of the SIGPLAN '92 Conference on Programming Language Design and Implementation. ACM, New York.
[35]
HILL, M., LARUS, J., REINHARDT, S., AND WOOD, D. 1992. Cooperative shared memory: Software and hardware for scalable multiprocessors. In Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York, 262-273.
[36]
HOARE, C. A. R. 1985. Communicating Sequential Processes. Prentice-Hall, Englewood Cliffs, N.J.
[37]
INMOS LIMITED. 1984. Occam Programming Manual. Prentice-Hall, Englewood Cliffs, N.J.
[38]
INTEL SUPERCOMPUTER SYSTEMS DIVISION. 1991. Paragon XP/S Product Overview. Intel Supercomputer Systems Division.
[39]
KARAMCHETI, V. AND CHIEN, A. 1995. A comparison of architectural support for messaging in the TMC CM-5 and the Cray T3D. In Proceedings of the 22nd International Symposium on Computer Architecture. ACM, New York.
[40]
KENDALL SQUARE RESEARCH CORPORATION. 1992. KSR-1 Technical Summary. Kendall Square Research Corp., Cambridge, Mass.
[41]
KLAPPHOLZ, D. 1989. Refined Fortran: An update. In Proceedings of Supercomputing '89. IEEE Computer Society Press, Los Alamitos, Calif.
[42]
KLAPPHOLZ, D., KALLIS, A., AND KONG, X. 1990. Refined C--An Update. In Languages and Compilers for Parallel Computing, D. Gelernter, A. Nicolau, and D. Padua, Eds. The MIT Press, Cambridge, Mass., 331-357.
[43]
KRISHNAMURTHY, A., CULLER, D., DUSSEAU, A., GOLDSTEIN, S., LUMETTA, S., VON EICKEN, T., AND YELICK, K. 1992. Parallel programming in Split-C. In Proceedings of Supercomputing '92. IEEE Computer Society Press, Los Alamitos, Calif., 262-273.
[44]
LAM, M. AND RINARD, M. 1991. Coarse-grain parallel programming in Jade. In Proceedings of the 3rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York, 94-105.
[45]
LENOSKI, D. 1992. The design and analysis of DASH: A scalable directory-based multiprocessor. Ph.D. thesis, Dept. of Electrical Engineering, Stanford Univ., Stanford, Calif.
[46]
LENOSKI, D., LAUDON, J., JOE, T., NAKAHIRA, D., STEVENS, L., GUPTA, A., AND HENNESSY, J. 1992. The DASH prototype: Implementation and performance. In Proceedings of the 19th International Symposium on Computer Architecture. ACM, New York.
[47]
LI, K. 1986. Shared virtual memory on loosely coupled multiprocessors. Ph.D. thesis, Dept. of Computer Science, Yale Univ., New Haven, Conn.
[48]
LUCASSEN, J. 1987. Types and effects: Towards the integration of functional and imperative programming. Tech. Rep. MIT/LCS/TR-408, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass. Aug.
[49]
LUSK, E., OVERBEEK, R., BOYLE, J., BUTLER, R., DISZ, T., GLICKFIELD, B., PATTERSON, J., AND STEVENS, R. 1987. Portable Programs for Parallel Processors. Holt, Rinehart and Winston, Inc.
[50]
MARTONOSI, M. AND GUPTA, A. 1989. Tradeoffs in message passing and shared memory implementations of a standard cell router. In Proceedings of the 1989 International Conference on Parallel Processing. 88-96.
[51]
METCALF, M. AND REID, J. 1990. Fortran 90 Explained. Oxford Science Publications.
[52]
MOHR, E., KRANZ, D., AND HALSTEAD, R. 1990. Lazy task creation: A technique for increasing the granularity of parallel programs. In Proceedings of the 1990 ACM Conference on Lisp and Functional Programming. ACM, New York, 185-197.
[53]
NIEH, J. AND LEVOY, M. 1992. Volume rendering on scalable shared-memory MIMD architectures. Tech. Rep. CSL-TR-92-537, Computer Systems Laboratory, Stanford Univ., Stanford, Calif. Aug.
[54]
REPPY, J. 1992. Higher-order concurrency. Ph.D. thesis, Dept. of Computer Science, Cornell Univ., Ithaca, N.Y.
[55]
RINARD, M. 1994a. The design, implementation and evaluation of Jade, a portable, implicitly parallel programming language. Ph.D. thesis, Dept. of Computer Science, Stanford Univ., Stanford, Calif.
[56]
RINARD, M. 1994b. Implicitly synchronized abstract data types: Data structures for modular parallel programming. In Proceedings of the 2nd International Workshop on Massive Parallelism: Hardware, Software and Applications, M. Furnari, Ed. World Scientific Publishing, 259-274.
[57]
RINARD, M. AND LAM, M. 1992. Semantic foundations of Jade. In Proceedings of the 19th Annual ACM Symposium on the Principles of Programming Languages. ACM, New York, 105-118.
[58]
RINARD, M., SCALES, D., AND LAM, M. 1992. Heterogeneous parallel programming in Jade. In Proceedings of Supercomputing '92. IEEE Computer Society Press, Los Alamitos, Calif., 245-256.
[59]
RINARD, M., SCALES, D., AND LAM, M. 1993. Jade: A high-level, machine-independent language for parallel programming. IEEE Comput. 26, 6 (June), 28-38.
[60]
ROSE, J. AND STEELE, G. 1987. C*: An extended C language for data parallel programming. Tech. Rep. PL 87-5, Thinking Machines Corp., Cambridge, Mass. Apr.
[61]
ROTHBERG, E. 1993. Exploiting the memory hierarchy in sequential and parallel sparse Cholesky factorization. Ph.D. thesis, Dept. of Computer Science, Stanford Univ., Stanford, Calif.
[62]
SALMON, J. K. 1990. Parallel hierarchical N-body methods. Ph.D. thesis, California Institute of Technology.
[63]
SANDHU, H., GAMSA, B., AND ZHOU, S. 1993. The shared regions approach to software cache coherence on multiprocessors. In Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, New York, 229-238.
[64]
SCALES, D. AND LAM, M. S. 1994. The design and evaluation of a shared object system for distributed memory machines. In Proceedings of the 1st USENIX Symposium on Operating Systems Design and Implementation. ACM, New York.
[65]
SCALES, D., GHARACHORLOO, K., AND THEKKATH, C. 1994. Shasta: A low overhead, software-only approach for supporting fine-grain shared memory. In Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York.
[66]
SCHOINAS, I., FALSAFI, B., LEBECK, A., REINHARDT, S., LARUS, J., AND WOOD, D. 1994. Fine-grain access control for distributed shared memory. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York.
[67]
SINGH, J. 1993. Parallel hierarchical N-body methods and their implications for multiprocessors. Ph.D. thesis, Dept. of Electrical Engineering, Stanford Univ., Stanford, Calif.
[68]
SINGH, J. AND HENNESSY, J. 1992. Finding and exploiting parallelism in an ocean simulation program: Experiences, results, and implications. J. Parallel Distrib. Comput. 15, 1 (May), 27-48.
[69]
SINGH, J., WEBER, W., AND GUPTA, A. 1992. SPLASH: Stanford parallel applications for shared memory. Comput. Arch. News 20, 1 (Mar.), 5-44.
[70]
SUNDERAM, V. 1990. PVM: A framework for parallel distributed computing. Concur. Pract. Exper. 2, 4 (Dec.), 315-339.
[71]
THINKING MACHINES CORPORATION. 1991. The Connection Machine CM-5 Technical Summary. Thinking Machines Corp., Cambridge, Mass.
[72]
TRAUB, K. 1991. Implementation of Non-strict Functional Programming Languages. The MIT Press, Cambridge, Mass.
[73]
WOO, S., OHARA, M., TORRIE, E., SINGH, J., AND GUPTA, A. 1995. The SPLASH-2 programs: Characterization and methodological considerations. In Proceedings of the 22nd International Symposium on Computer Architecture. ACM, New York.
[74]
YONEZAWA, A., BRIOT, J.-P., AND SHIBAYAMA, E. 1986. Object oriented concurrent programming in ABCL/1. In Proceedings of the 1st Annual Conference on Object-Oriented Programming Systems, Languages and Applications. ACM, New York, 258-268.


Reviews

M. S. Joy

Jade is an extension of C that is designed to be run on a parallel machine. It is a sequential language, and process concurrency is introduced automatically by analysis of the source code. Jade uses constructs to indicate where data within a program may be accessed by multiple processes. If all such constructs are edited out of a program, the result is a C program. Furthermore, the semantics of a Jade program is identical to that of the sequential C program which it extends; this equivalence speeds up the development of parallel programs.

Jade has been implemented and tested on a variety of parallel platforms, including a shared-memory machine (the Stanford DASH) and a message-passing platform (the Intel iPSC/860). Statistics are presented comparing six test programs on both platforms, using between 1 and 32 processors. The speedup is almost linear for many of the tests where the process granularity is not too fine, but overhead related to process creation causes poor performance in examples where too many short-lived processes are created.

This valuable, interesting, and exceptionally clear paper addresses all aspects of the design and implementation of the language. Although Jade is not meant to be a general-purpose programming language, since it cannot express certain kinds of parallel algorithms, the authors have demonstrated the viability of their paradigm when applied to suitable programming tasks.
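The review's observation that editing the Jade constructs out of a program leaves an ordinary C program can be made concrete with a sketch. The `withonly ... do` form and the `rd`/`wr` access declarations below are approximated from the Jade literature rather than quoted from this article, and the task body names are hypothetical:

```
/* Sketch of a Jade-style task declaration (syntax approximated, not
   quoted from the article). The access specification section declares
   how the task body will access shared objects; the body itself is
   ordinary serial C. */
withonly {
    rd(coeffs);            /* task only reads coeffs */
    wr(column);            /* task may write column  */
} do (coeffs, column) {
    factor(coeffs, column);    /* unchanged serial code */
}
```

Deleting the `withonly` wrapper and keeping only the body recovers the serial program, which is the equivalence the review credits with speeding up parallel program development.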



Published In

ACM Transactions on Programming Languages and Systems  Volume 20, Issue 3
May 1998
248 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/291889

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. parallel computing
  2. parallel programming languages

Qualifiers

  • Article



Article Metrics

  • Downloads (Last 12 months)177
  • Downloads (Last 6 weeks)25
Reflects downloads up to 13 Dec 2024


Cited By

  • (2024)A Review on Multi-Agent Systems and JADE Applications in Microgrids2024 12th International Conference on Smart Grid (icSmartGrid)10.1109/icSmartGrid61824.2024.10578156(623-628)Online publication date: 27-May-2024
  • (2022)Understanding and Reaching the Performance Limit of Schedule Tuning on Stable Synchronization DeterminismProceedings of the International Conference on Parallel Architectures and Compilation Techniques10.1145/3559009.3569669(223-238)Online publication date: 8-Oct-2022
  • (2022)A Comprehensive Exploration of Languages for Parallel ComputingACM Computing Surveys10.1145/348500855:2(1-39)Online publication date: 18-Jan-2022
  • (2020)Deterministic Atomic Buffering2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO50266.2020.00083(981-995)Online publication date: Oct-2020
  • (2019)Processor-Oblivious Record and ReplayACM Transactions on Parallel Computing10.1145/33656596:4(1-28)Online publication date: 17-Dec-2019
  • (2019)PlanAlyzer: assessing threats to the validity of online experimentsProceedings of the ACM on Programming Languages10.1145/33606083:OOPSLA(1-30)Online publication date: 10-Oct-2019
  • (2019)Coverage guided, property based testingProceedings of the ACM on Programming Languages10.1145/33606073:OOPSLA(1-29)Online publication date: 10-Oct-2019
  • (2019)Trace aware random testing for distributed systemsProceedings of the ACM on Programming Languages10.1145/33606063:OOPSLA(1-29)Online publication date: 10-Oct-2019
  • (2019)Dependence-aware, unbounded sound predictive race detectionProceedings of the ACM on Programming Languages10.1145/33606053:OOPSLA(1-30)Online publication date: 10-Oct-2019
  • (2019)Lazy Determinism for Faster Deterministic MultithreadingProceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3297858.3304047(879-891)Online publication date: 4-Apr-2019
