- short-paper, May 2016
Network-Managed Virtual Global Address Space for Message-driven Runtimes
HPDC '16: Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, Pages 15–18, https://doi.org/10.1145/2907294.2907320
Maintaining a scalable high-performance virtual global address space using distributed memory hardware has proven to be challenging. In this paper we evaluate a new approach for such an active global address space that leverages the capabilities of the ...
- research-article, October 2014
Alembic: automatic locality extraction via migration
OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, Pages 879–894, https://doi.org/10.1145/2660193.2660194
Partitioned Global Address Space (PGAS) environments simplify writing parallel code for clusters because they make data movement implicit - dereferencing global pointers automatically moves data around. However, it does not free the programmer from ...
Also Published in:
ACM SIGPLAN Notices: Volume 49 Issue 10
- poster, August 2014
Coarrays in GNU Fortran
PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation, Pages 513–514, https://doi.org/10.1145/2628071.2671427
Coarray Fortran is a set of features of the Fortran 2008 standard which makes Fortran a PGAS language. Currently, the coarray support is provided mainly by commercial compilers like Cray and Intel. In this work we present two coarray implementations on ...
- poster, August 2014
Stratified Sampling for Even Workload Partitioning
PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation, Pages 503–504, https://doi.org/10.1145/2628071.2671422
This work presents a novel algorithm, Workload Partitioning and Scheduling (WPS), for evenly partitioning the computational workload of large implicitly-defined work-list based applications on distributed/shared-memory systems. WPS uses stratified ...
- poster, June 2013
Improving performance of openSHMEM reference library by portable PE mapping technique
ICS '13: Proceedings of the 27th international ACM conference on International conference on supercomputing, Pages 485–486, https://doi.org/10.1145/2464996.2467279
Reducing data communication cost is a critical performance consideration and the need is more acute when using libraries like the OpenSHMEM Reference library which has to sacrifice some performance optimizations for portability. Being a Partitioned ...
- research-article, June 2012
Exploring cross-layer power management for PGAS applications on the SCC platform
HPDC '12: Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing, Pages 235–246, https://doi.org/10.1145/2287076.2287113
High-performance parallel computing architectures are increasingly based on multi-core processors. While current commercially available processors are at 8 and 16 cores, technological and power constraints are limiting the performance growth of the ...
- research-article, May 2012
GA-GPU: extending a library-based global address space programming model for scalable heterogeneous computing systems
CF '12: Proceedings of the 9th conference on Computing Frontiers, Pages 53–64, https://doi.org/10.1145/2212908.2212918
Scalable heterogeneous computing (SHC) architectures are emerging as a response to new requirements for low cost, power efficiency, and high performance. For example, numerous contemporary HPC systems are using commodity Graphical Processing Units (GPU) ...
- poster, May 2011
SRC: OpenSHMEM library development
ICS '11: Proceedings of the international conference on Supercomputing, Page 374, https://doi.org/10.1145/1995896.1995957
OpenSHMEM is a PGAS programming library implementing an RMA-based point-to-point and collective communication paradigm which decouples data motion from synchronization. This results in a more scalable programming model than more common two-sided ...
- research-article, May 2010
Enabling a highly-scalable global address space model for petascale computing
CF '10: Proceedings of the 7th ACM international conference on Computing frontiers, Pages 207–216, https://doi.org/10.1145/1787275.1787326
Over the past decade, the trajectory to the petascale has been built on increased complexity and scale of the underlying parallel architectures. Meanwhile, software developers have struggled to provide tools that maintain the productivity of ...
- research-article, May 2010
Hybrid parallel programming with MPI and unified parallel C
CF '10: Proceedings of the 7th ACM international conference on Computing frontiers, Pages 177–186, https://doi.org/10.1145/1787275.1787323
The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global ...
- research-article, February 2009
Efficient, portable implementation of asynchronous multi-place programs
Ganesh Bikshandi, Jose G. Castanos, Sreedhar B. Kodali, V. Krishna Nandivada, Igor Peshansky, Vijay A. Saraswat, Sayantan Sur, Pradeep Varma, Tong Wen
PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 271–282, https://doi.org/10.1145/1504176.1504215
The X10 programming language is organized around the notion of places (an encapsulation of data and activities operating on the data), partitioned global address space (PGAS), and asynchronous computation and communication. This paper introduces an ...
Also Published in:
ACM SIGPLAN Notices: Volume 44 Issue 4