Interleaved parallel schemes: improving memory throughput on supercomputers

Published: 01 April 1992

Abstract

On many commercial supercomputers, several vector register processors share a global, highly interleaved memory in MIMD mode. When all the processors are working on a single vector loop, a significant part of the potential memory throughput may be wasted because the processors run asynchronously.
To limit this loss of memory throughput, a SIMD synchronization mode may be used for vector accesses to memory. However, a significant part of the memory bandwidth may still be wasted when accessing vectors with an even stride.
In this paper, we present IPS, an interleaved parallel scheme that ensures an equitable distribution of elements across a highly interleaved memory for a wide range of vector strides. We show how to organize memory accesses so that unscrambling vectors from memory to the vector register processors requires a minimum number of passes through the interconnection network.
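
The even-stride problem mentioned above can be illustrated with a minimal sketch. It assumes conventional low-order interleaving (bank index = word address mod number of banks), which is not the IPS scheme itself, whose mapping the abstract does not specify. The function names and the choice of 16 banks are hypothetical, chosen only for illustration.

```python
from math import gcd


def banks_touched(stride: int, num_banks: int, length: int) -> int:
    """Count the distinct banks hit by a strided vector access.

    Assumes low-order interleaving: element i lives at word address
    i * stride, and a word address maps to bank (address % num_banks).
    """
    return len({(i * stride) % num_banks for i in range(length)})


def banks_touched_closed_form(stride: int, num_banks: int) -> int:
    """Closed form for a long vector: num_banks / gcd(stride, num_banks)."""
    return num_banks // gcd(stride, num_banks)


if __name__ == "__main__":
    NUM_BANKS = 16      # assumed power-of-two bank count
    VECTOR_LENGTH = 64  # assumed vector length, at least NUM_BANKS
    for stride in (1, 2, 3, 4, 8, 16):
        used = banks_touched(stride, NUM_BANKS, VECTOR_LENGTH)
        print(f"stride {stride:2d}: {used:2d} of {NUM_BANKS} banks used")
```

Under this assumed mapping, odd strides reach every bank, while even strides reach only a fraction of them when the number of banks is a power of two. That is the kind of bandwidth loss a stride-tolerant distribution is meant to avoid.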



Information

Published In

ACM SIGARCH Computer Architecture News, Volume 20, Issue 2
Special Issue: Proceedings of the 19th annual international symposium on Computer architecture (ISCA '92)
May 1992
429 pages
ISSN: 0163-5964
DOI: 10.1145/146628

ISCA '92: Proceedings of the 19th annual international symposium on Computer architecture
May 1992
439 pages
ISBN: 0897915097
DOI: 10.1145/139669

Publisher

Association for Computing Machinery

New York, NY, United States


Cited By

  • (1995) Semi-linear and bi-base storage schemes classes. Proceedings of the 9th international conference on Supercomputing, pp. 299-307. DOI: 10.1145/224538.224574. Online publication date: 3-Jul-1995
  • (2012) Isomorphic Recursive Splitting. Proceedings of the 2012 41st International Conference on Parallel Processing Workshops, pp. 574-580. DOI: 10.1109/ICPPW.2012.78. Online publication date: 10-Sep-2012
  • (2012) Memory Affinity. Proceedings of the 2012 IEEE International Conference on Cluster Computing, pp. 605-609. DOI: 10.1109/CLUSTER.2012.33. Online publication date: 24-Sep-2012
  • (2009) PPT. Proceedings of the 2009 ACM/IEEE international symposium on Low power electronics and design, pp. 93-98. DOI: 10.1145/1594233.1594255. Online publication date: 19-Aug-2009
  • (2008) High-bandwidth Address Generation Unit. Journal of Signal Processing Systems 57:1, pp. 33-44. DOI: 10.1007/s11265-008-0174-x. Online publication date: 19-Jun-2008
  • (2005) Conflict-Free Accesses to Strided Vectors on a Banked Cache. IEEE Transactions on Computers 54:7, pp. 913-916. DOI: 10.1109/TC.2005.110. Online publication date: 1-Jul-2005
  • (2005) Memory access synchronization in vector multiprocessors. Parallel Processing: CONPAR 94 — VAPP VI, pp. 414-425. DOI: 10.1007/3-540-58430-7_37. Online publication date: 3-Jun-2005
  • (2000) A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality. Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, pp. 32-41. DOI: 10.1145/360128.360134. Online publication date: 1-Dec-2000
  • (1996) Increasing the effective bandwidth of complex memory systems in multivector processors. Proceedings of the 1996 ACM/IEEE conference on Supercomputing, pp. 26-es. DOI: 10.1145/369028.369084. Online publication date: 17-Nov-1996
  • (1995) Vector multiprocessors with arbitrated memory access. ACM SIGARCH Computer Architecture News 23:2, pp. 243-252. DOI: 10.1145/225830.224435. Online publication date: 1-May-1995
