More Web Proxy on the site http://driver.im/

research-article

Topology-induced Enhancement of Mappings

Authors:

Henning MeyerhenkeAuthors Info & Claims

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing

Article No.: 9, Pages 1 - 10

https://doi.org/10.1145/3225058.3225117

Published: 13 August 2018 Publication History

Abstract

In this paper we propose a new method to enhance a mapping μ(·) of a parallel application's computational tasks to the processing elements (PEs) of a parallel computer. The idea behind our method TiMEr is to enhance such a mapping by drawing on the observation that many topologies take the form of a partial cube. This class of graphs includes all rectangular and cubic meshes, any such torus with even extensions in each dimension, all hypercubes, and all trees.

Following previous work, we represent the parallel application and the parallel computer by graphs Ga and Gp, respectively. Gp being a partial cube allows us to label its vertices, the PEs, by bitvectors such that the cost of exchanging one unit of information between any two vertices of Gp amounts to the Hamming distance between their labels. By transferring these bitvectors to the vertex set of Ga via μ-1(·) and extending them to be unique on Ga, we can enhance μ(·) by swapping labels on Ga in a new way. Pairs of swapped labels are local w. r. t. the PEs, but not w. r. t. Ga. Moreover, permutations of the bitvectors' entries give rise to a plethora of hierarchies on the PEs. Through these hierarchies we turn TiMEr into a hierarchical method for improving μ(·) that is complementary to state-of-the-art methods for computing μ(·) in the first place.

In our experiments we use TiMEr to enhance mappings of complex networks onto rectangular meshes and tori with 256 and 512 nodes, as well as hypercubes with 256 nodes. It turns out that common quality measure of mappings derived from state-of-the-art tools, such as Scotch and KaHIP, can be improved up to 34%.

References

[1]

Eric Aubanel. 2009. Resource-Aware Load Balancing of Parallel Applications. In Handbook of Research on Grid Technologies and Utility Computing: Concepts for Managing Large-Scale Applications, Emmanuel Udoh and Frank Zhigang Wang (Eds.). Information Science Reference - Imprint of: IGI Publishing, 12--21.

[2]

C. Bichot and P. Siarry (Eds.). 2011. Graph Partitioning. Wiley.

[3]

B. Brandfass, T. Alrutz, and T. Gerhold. 2013. Rank Reordering for MPI Communication Optimization. Computers & Fluids 80, 0 (2013), 372--380.

[4]

Aydin Buluç, Henning Meyerhenke, Ilya Safro, Peter Sanders, and Christian Schulz. 2016. Recent Advances in Graph Partitioning. In Algorithm Engineering -Selected Results and Surveys, Lasse Kliemann and Peter Sanders (Eds.). Lecture Notes in Computer Science, Vol. 9220. 117--158.

[5]

Siew Yin Chan, Teck Chaw Ling, and Eric Aubanel. 2012. The Impact of Heterogeneous Multi-Core Clusters on Graph Partitioning: An Empirical Study. Cluster Computing 15, 3 (2012), 281--302.

Digital Library

[6]

Cédric Chevalier and Ilya Safro. 2009. Comparison of Coarsening Schemes for Multilevel Graph Partitioning. In Learning and Intelligent Optimization, Third International Conference, LION 3, Trento, Italy, January 14-18, 2009. Selected Papers (Lecture Notes in Computer Science), Thomas Stützle (Ed.), Vol. 5851. Springer, 191--205.

Digital Library

[7]

C. Clauss, S. Lankes, P. Reble, and T. Bemmerl. 2011. Evaluation and improvements of programming models for the Intel SCC many-core processor. In 2011 International Conference on High Performance Computing Simulation. 525--532.

[8]

M. Deveci, K. Kaya, B. Uçar, and U. V. Catalyurek. 2015. Fast and High Quality Topology-Aware Task Mapping. In 2015 IEEE International Parallel and Distributed Processing Symposium. 197--206.

Digital Library

[9]

D. Eppstein. 2011. Recognizing Partial Cubes in Quadratic Time. J. Graph Algorithms Appl. 15, 2 (2011), 269--293.

[10]

Michael R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co.

Digital Library

[11]

R. Glantz, H. Meyerhenke, and A. Noe. 2015. Algorithms for Mapping Parallel Processes onto Grid and Torus Architectures. In 23rd Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2015, Turku, Finland. 236--243.

Digital Library

[12]

T. Hoefler and M. Snir. 2011. Generic Topology Mapping Strategies for Large-scale Parallel Architectures. In ACM International Conference on Supercomputing (ICS'11). ACM, 75--85.

Digital Library

[13]

W. Imrich. 1993. A simple O(mn) algorithm for recognizing Hamming graphs. Bull. Inst. Comb. Appl (1993), 45--56.

[14]

E. Jeannot, G. Mercier, and F. Tessier. 2013. Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques. IEEE Transactions on Parallel and Distributed Systems PP, 99 (2013), 1--1.

Digital Library

[15]

G. Karypis and V. Kumar. 1998. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs. SIAM J. Sci. Comput. 20, 1 (1998), 359--392.

Digital Library

[16]

B. W. Kernighan and S. Lin. 1970. An efficient heuristic procedure for partitioning graphs. Bell Systems Technical Journal 49 (1970), 291--307.

[17]

Young Man Kim and Ten-Hwang Lai. 1991. The Complexity of Congestion-1 Embedding in a Hypercube. Journal of Algorithms 12, 2 (1991), 246--280.

Digital Library

[18]

Shad Kirmani, JeongHyung Park, and Padma Raghavan. 2017. An embedded sectioning scheme for multiprocessor topology-aware mapping of irregular applications. IJHPCA 31, 1 (2017), 91--103.

Digital Library

[19]

Henning Meyerhenke, Burkhard Monien, and Thomas Sauerwald. 2009. A New Diffusion-based Multilevel Algorithm for Computing Graph Partitions. J. Parallel and Distributed Computing 69, 9 (2009), 750--761.

Digital Library

[20]

Vitaly Osipov and Peter Sanders. 2010. n-Level Graph Partitioning. In Algorithms - ESA 2010, 18th Annual European Symposium, Liverpool, UK, September 6-8, 2010. Proceedings, Part I (Lecture Notes in Computer Science), Mark de Berg and Ulrich Meyer (Eds.), Vol. 6346. Springer, 278--289.

Digital Library

[21]

S. Ovchinnikov. 2008. Partial cubes: Structures, characterizations, and constructions. Discrete Mathematics 308 (2008), 5597--5621.

Digital Library

[22]

François Pellegrini. 1994. Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs. In Scalable High-Performance Computing Conference (SHPCC). IEEE, 486--493.

[23]

François Pellegrini. 2007. Scotch and libScotch 5.0 User's Guide. Technical Report. LaBRI, Université Bordeaux I.

[24]

François Pellegrini. 2011. Static Mapping of Process Graphs. In Graph Partitioning, Charles-Edmond Bichot and Patrick Siarry (Eds.). John Wiley & Sons, Chapter 5, 115--136.

[25]

Xinyu Que, Fabio Checconi, Fabrizio Petrini, and John A. Gunnels. 2015. Scalable Community Detection with the Louvain Algorithm. In 2015 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, May 25-29, 2015. IEEE Computer Society, 28--37.

Digital Library

[26]

I. Safro, P. Sanders, and Ch. Schulz. 2012. Advanced Coarsening Schemes for Graph Partitioning. In Proc. 11th Int. Symp. on Experimental Algorithms. Springer, 369--380.

Digital Library

[27]

P. Sanders and C. Schulz. 2013. High Quality Graph Partitioning. In Proc. of the 10th DIMACS Impl. Challenge Workshop: Graph Partitioning and Graph Clustering. AMS, 1--17.

[28]

Peter Sanders and Christian Schulz. 2013. KaHIP v0.53 - Karlsruhe High Quality Partitioning - User Guide. CoRR abs/1311.1714 (2013).

[29]

C. Schulz and J. L. Träff. 2017. Better Process Mapping and Sparse Quadratic Assignment. CoRR abs/1702.04164 (2017).

[30]

Roman Trobec and Gregor Kosec. 2015. Parallel Scientific Computing. Theory, Algorithms, and Applications of Mesh Based and Meshless Methods. Springer Intl. Publ.

Digital Library

[31]

Bora Ucar, Cevdet Aykanat, Kamer Kaya, and Murat Ikinci. 2006. Task Assignment in Heterogeneous Computing Systems. J. Parallel and Distrib. Comput. 66, 1 (2006), 32--46.

Digital Library

[32]

Chris Walshaw and Mark Cross. 2001. Multilevel Mesh Partitioning for Heterogeneous Communication Networks. Future Generation Comp. Syst. 17, 5 (2001), 601--623.

[33]

Hao Yu, I-Hsin Chung, and Jose Moreira. 2006. Topology Mapping for Blue Gene/L Supercomputer. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC '06). ACM, New York, NY, USA.

Digital Library

Cited By

Predari MTzovas CSchulz CMeyerhenke H(2021)An MPI-based Algorithm for Mapping Complex Networks onto Hierarchical ArchitecturesEuro-Par 2021: Parallel Processing10.1007/978-3-030-85665-6_11(167-182)Online publication date: 1-Sep-2021
https://dl.acm.org/doi/10.1007/978-3-030-85665-6_11
Tzovas CPredari MMeyerhenke H(2020)Distributing Sparse Matrix/Graph Applications in Heterogeneous Clusters - an Experimental Study2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)10.1109/HiPC50609.2020.00021(72-81)Online publication date: Dec-2020
https://doi.org/10.1109/HiPC50609.2020.00021

Index Terms

Topology-induced Enhancement of Mappings
1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed algorithms

Recommendations

Induced cycles in crossing graphs of median graphs

The crossing graph G^# of a partial cube G has the equivalence classes of the Djokovic-Winkler relation @Q as vertices, two @Q-classes being adjacent if they appear on some common isometric cycle. The following question [S. Klavzar, H.M. Mulder, Partial ...
An Edit Distance Between Graph Correspondences
Graph-Based Representations in Pattern Recognition
Abstract
The Hamming Distance has been largely used to calculate the dissimilarity of a pair of correspondences (also known as labellings or matchings) between two structures (i.e. sets of points, strings or graphs). Although it has the advantage of being ...
A Generalized Upper Bound and a Multilevel Construction for Distance-Preserving Mappings

A new general upper bound is derived on the sum of the Hamming distances between sequences when mapping from one set of sequences to another. It is shown that a similar upper bound for mappings from binary sequences to permutation sequences is a special ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences

ICPP '18: Proceedings of the 47th International Conference on Parallel Processing

August 2018

945 pages

ISBN:9781450365109

DOI:10.1145/3225058

Copyright © 2018 ACM.

© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

University of Oregon: University of Oregon

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 August 2018

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed limited

Conference

ICPP 2018

ICPP 2018: 47th International Conference on Parallel Processing

August 13 - 16, 2018

OR, Eugene, USA

Acceptance Rates

ICPP '18 Paper Acceptance Rate 91 of 313 submissions, 29%;

Overall Acceptance Rate 91 of 313 submissions, 29%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
92
Total Downloads

Downloads (Last 12 months)1
Downloads (Last 6 weeks)0

Reflects downloads up to 15 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Predari MTzovas CSchulz CMeyerhenke H(2021)An MPI-based Algorithm for Mapping Complex Networks onto Hierarchical ArchitecturesEuro-Par 2021: Parallel Processing10.1007/978-3-030-85665-6_11(167-182)Online publication date: 1-Sep-2021
https://dl.acm.org/doi/10.1007/978-3-030-85665-6_11
Tzovas CPredari MMeyerhenke H(2020)Distributing Sparse Matrix/Graph Applications in Heterogeneous Clusters - an Experimental Study2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)10.1109/HiPC50609.2020.00021(72-81)Online publication date: Dec-2020
https://doi.org/10.1109/HiPC50609.2020.00021

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents