More Web Proxy on the site http://driver.im/

research-article

Priority Queues and Sorting Methods for Parallel Simulation

Authors:

Miltos D. Grammatikakis,

Stefan LiescheAuthors Info & Claims

IEEE Transactions on Software Engineering, Volume 26, Issue 5

Pages 401 - 422

https://doi.org/10.1109/32.846298

Published: 01 May 2000 Publication History

Abstract

We examine the design, implementation, and experimental analysis of parallel priority queues for device and network simulation. We consider: 1) distributed splay trees using MPI, 2) concurrent heaps using shared memory atomic locks, and 3) a new, more general concurrent data structure based on distributed sorted lists, which is designed to provide dynamically balanced work allocation (with automatic or manual control) and efficient use of shared memory resources. We evaluate performance for all three data structures on a Cray-T3E900 system at KFA-Jülich. Our comparisons are based on simulations of single buffers and a $64 \times 64$ packet switch which supports multicasting. In all implementations, PEs monitor traffic at their preassigned input/output ports, while priority queue elements are distributed across the Cray-T3E virtual shared memory. Our experiments with up to 60,000 packets and two to 64 PEs indicate that concurrent priority queues perform much better than distributed ones. Both concurrent implementations have comparable performance, while our new data structure uses less memory and has been further optimized. We also consider parallel simulation for symmetric networks by sorting integer conflict functions and implementing an interesting packet indexing scheme. The optimized message passing network simulator can process $\sim 500$K packet moves in one second, with an efficiency that exceeds $\sim 50$ percent for a few thousands packets on the Cray-T3E with 32 PEs. All developed data structures now form a parallel library. Although our concurrent implementations use the Cray-T3E ShMem library, portability can be derived from Open-MP or MPI-2 standard libraries, which will provide support for one-way communication and shared memory lock mechanisms.

References

[1]

S. Arora F.T. Leighton and B.M. Maggs, “On-Line Algorithms for Path Selection in a Nonblocking Network,” SIAM J. Computing, vol. 25, no. 3, pp. 600–625, 1996.

Digital Library

[2]

J. Aspnes M. Herlihy and N. Shavit, “Counting Networks,” J. ACM, vol. 41, no. 5, pp. 1,020–1,048, 1994.

Digital Library

[3]

H. Attiya and R. Friedman., “Programming DEC-Alpha Based Multiprocessors the Easy Way,” Proc. Sixth ACM Symp. Parallel Algorithms and Architectures, pp. 157–166, 1990.

Digital Library

[4]

G.S. Brodal J.L. Traff and C.D. Zaroliagis, “A Parallel Priority Queue with Constant Time Operations,” J. Parallel and Distributed Computing, vol. C-49, no. 1, pp. 4–21, 1998.

Digital Library

[5]

J.R. Driscoll H.N. Gabow R. Shrairman and R.E. Tarjan, “An Alternative to Fibonacci Heaps with Applications to Parallel Computation. Comm.,” Comm. ACM, vol. 31, no. 11, pp. 1,343–1,354, 1988.

Digital Library

[6]

M.D. Grammatikakis N. Fideropoulos F. Howell S. Liesche T. Thielke and A. Zachos, “Network Simulation on the CM-5 by Sorting Integer Conflict Functions,” Proc. Parallel Computer Conf., 1997.

[7]

M.D. Grammatikakis N. Fideropoulos and A. Zachos., “Network Simulation on Cray-T3E Using MPI,” Proc. Third Cray-SGI MPP Conf., http://armoise.saclay.cea.fr/~workshop/Program.html, 1997.

[8]

M.D. Grammatikakis D.F. Hsu M. Kraetzl and J. Sibeyn, “Packet Routing in Fixed-Connection Networks: A Survey,” J. Parallel and Distributed Computing, pp. 77–132, 1998.

Digital Library

[9]

M.D. Grammatikakis and M. Johl., “Clock-Cycle Level Simulations of an ATM Switch,” Proc. First SCS Euro Media Conf., pp. 149–156, 1996.

[10]

M.D. Grammatikakis and S. Liesche., “Synchronization on Cray-T3E Virtual Shared Memory,” Proc. 40th Cray Users Group Conf., http://theoretica.informatik.uni-oldenburg.de/mdgramma/mutex.ps.gz, 1998.

[11]

D.R. Helman D. Bader and J. JáJá, “Parallel Algorithms for Personalized Communication and Sorting with an Experimental Study,” Proc. ACM Symp. Parallel Algorithms and Architectures, pp. 211–220, 1996.

Digital Library

[12]

M. Herlihy B.H. Lim and N. Shavit, “Scalable Concurrent Counting,” Proc. ACM Trans. Computing Systems, vol. C-13, no. 4, pp. 343–364, 1995.

Digital Library

[13]

M. Herlihy and J.E.B. Moss, “Transactional Memory: Architectural Support for Lock-Free Data Structures,” Proc. Int'l Symp. Computer Architecture, pp. 289–301, 1993.

Digital Library

[14]

F.W. Howell, “Reverse Profiling,” Proc. First Int'l Workshop Parallel and Distributed Software Eng., pp. 245–255, 1996.

Digital Library

[15]

G.C. Hunt M. Michael M.S. Parthasarathy and M.L. Scott, “An Efficient Algorithm for Concurrent Priority Queue Heaps,” Information Procedure Letters, vol. 60, no. 3 pp. 151–157, 1996.

Digital Library

[16]

D.W. Jones., “Concurrent Operations on Priority Queues,” Comm. ACM, vol. 32, no. 1, pp. 132–137, 1989.

Digital Library

[17]

D.E. Knuth., The Art of Computer Programming: Sorting and Searching. Addison Wesley, 1973.

Digital Library

[18]

F.T. Leighton D. Lisinski and B.M. Maggs, “Empirical Evaluation of Randomly-Wired Multistage Networks,” Proc. IEEE Int'l Conf. Computer Design, pp. 380–385, 1990.

[19]

B.M. Maggs, “Randomly-Wired Multistage Networks,” Statistical Science, vol. C-8, no. 1, pp. 70–75, 1993.

[20]

B. Mans, “Portable Distributed Priority Queues with MPI,” Concurrency: Practice and Experience, vol. 10, no. 3, pp. 175–198, 1998.

[21]

Z.G Mou and M. Goodman., “A Comparison of Communication Costs for Three Parallel Program Paradigms on Hypercube and Mesh Architectures,” Proc. Fifth SIAM Conf. Parallel Processing and Scientific Computing, pp. 491–500, 1992.

Digital Library

[22]

“MPI-2: Extensions to the Message-Passing Interface,” Proc. MPI Forum, July 1997.

[23]

S. Liesche., “MPI and Shared Memory Implementations of Priority Queues for Parallel Simulation on Cray-T3E,” Diplomarbeit, Germany: Inst. of Informatics, Univ. of Hildesheim, http://members.aol.com/liesche/darbeit.zip, May 1998.

[24]

J.M. Mellor-Crummey and M.L. Scott, “Algorithms for Scalable Synchronization on Shared Memory Multiprocessors,” Proc. ACM Trans. Computer Systems, vol. C-9, no. 1, pp. 21–65, 1991.

Digital Library

[25]

D. Peleg, “Distributed Data Structures: A Complexity-Oriented View,” Proc. Int'l Workshop Distr. Alg., vol. 486, pp. 71–89, 1991.

Digital Library

[26]

V.N. Rao and V. Kumar, “Concurrent Access of Priority Queues,” IEEE Trans. Computer, vol. 37, no. 12, pp. 1,657–1,665, Dec. 1988.

Digital Library

[27]

P. Sanders, “Fast Priority Queues for Parallel Branch-and-Bound,” Proc. Int'l Workshop Algorithm Irregularities and Structural Problems, vol. 980, pp. 379–393, 1995.

Digital Library

[28]

P. Sanders, “Randomized Priority Queues for Fast Parallel Access,” J. Parallel Distributed and Computing, vol. C-49, no. 1, pp. 86–97, 1998.

Digital Library

[29]

S. Savage M. Burrows G. Nelson P. Sobalvarro and T. Anderson, “Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs,” Proc. 16th ACM Symp. OS Principles, France: Saint-Malo, pp. 26–37, 1997.

Digital Library

[30]

D. Sleator and R.E. Tarjan, “Self-Adjusting Binary Search Trees,” SIAM J. Computing, vol. 15, no. 1, pp. 52–69, 1986.

Digital Library

[31]

R. Subramanian and I.D. Scherson, “An Analysis of Diffusive Load-Balancing,” Proc. ACM Symp. Parallel and Algorithms and Architectures, pp. 220–225, 1994.

Digital Library

[32]

J. Turner and N. Yamanaka, “Architectural Choices in Large Scale ATM Switches,” Trans. Inst. Electronics, Information and Communication Eng., 1998.

[33]

J.W. Williams, “Algorithm 232: Heapsort,” Comm. ACM, vol. 7, pp. 347–348, 1964.

Digital Library

Cited By

Park HFishwick P(2010)A GPU-Based Application Framework Supporting Fast Discrete-Event SimulationSimulation10.1177/003754970934078186:10(613-628)Online publication date: 1-Oct-2010
https://dl.acm.org/doi/10.1177/0037549709340781
Shrikumar HGielen G(2006)40Gbps de-layered silicon protocol engine for TCP recordProceedings of the conference on Design, automation and test in Europe: Proceedings10.5555/1131481.1131537(188-193)Online publication date: 6-Mar-2006
https://dl.acm.org/doi/10.5555/1131481.1131537
Sundell HTsigas P(2005)Fast and lock-free concurrent priority queues for multi-thread systemsJournal of Parallel and Distributed Computing10.1016/j.jpdc.2004.12.00565:5(609-627)Online publication date: 1-May-2005
https://dl.acm.org/doi/10.1016/j.jpdc.2004.12.005
Show More Cited By

Index Terms

Priority Queues and Sorting Methods for Parallel Simulation

Recommendations

MultiQueues: Simple Relaxed Concurrent Priority Queues
SPAA '15: Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures

We present a simple, concurrent data structure that approximates the behavior of a priority queue and that gives very good performance guarantees. We also discuss models for the semantics of relaxed priority queues and introduce a technique for "...
Waiting time and queue length analysis of Markov-modulated fluid priority queues
Abstract
This paper considers a multi-type fluid queue with priority service. The input fluid rates are modulated by a Markov chain, which is common for all fluid types. The service rate of the queue is constant. Various performance measures are derived, ...
Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems
IPDPS '03: Proceedings of the 17th International Symposium on Parallel and Distributed Processing

We present an efficient and practical lock-free implementation of a concurrent priority queue that is suitable for both fully concurrent (large multi-processor) systems as well as pre-emptive (multi-process) systems. Many algorithms for concurrent ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE Transactions on Software Engineering

IEEE Transactions on Software Engineering Volume 26, Issue 5

May 2000

94 pages

ISSN:0098-5589

Editor:
Anneliese von Mayrhauser

Issue’s Table of Contents

Copyright © Copyright © 2000 IEEE. All Rights Reserved.

Publisher

IEEE Press

Publication History

Published: 01 May 2000

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

4
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 19 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Park HFishwick P(2010)A GPU-Based Application Framework Supporting Fast Discrete-Event SimulationSimulation10.1177/003754970934078186:10(613-628)Online publication date: 1-Oct-2010
https://dl.acm.org/doi/10.1177/0037549709340781
Shrikumar HGielen G(2006)40Gbps de-layered silicon protocol engine for TCP recordProceedings of the conference on Design, automation and test in Europe: Proceedings10.5555/1131481.1131537(188-193)Online publication date: 6-Mar-2006
https://dl.acm.org/doi/10.5555/1131481.1131537
Sundell HTsigas P(2005)Fast and lock-free concurrent priority queues for multi-thread systemsJournal of Parallel and Distributed Computing10.1016/j.jpdc.2004.12.00565:5(609-627)Online publication date: 1-May-2005
https://dl.acm.org/doi/10.1016/j.jpdc.2004.12.005
Grammatikakis MCoppola MSensini F(2003)Software for multiprocessor networks on chipNetworks on chip10.5555/903951.903966(281-303)Online publication date: 1-Jan-2003
https://dl.acm.org/doi/10.5555/903951.903966

View Options

View options

Media

Figures

Other

Tables

View Issue’s Table of Contents