Addressing communication latency issues on clusters for fine grained asynchronous applications—A case study

Umesh Kumar V. Rajasekaran¹,
Malolan Chetlur¹,
Girindra D. Sharma¹,
Radharamanan Radhakrishnan¹ &
Philip A. Wilsey¹

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1586)

Included in the following conference series:

International Parallel Processing Symposium


Abstract

With the advent of cheap and powerful hardware for workstations and networks, a new cluster-based architecture for parallel processing applications has been envisioned. However, fine-grained asynchronous applications that communicate frequently are not ideal candidates for such architectures because of the high communication latency they incur. Designers of fine-grained parallel applications on clusters therefore face the problem of reducing this latency. Depending on the resources available, communication latency can be improved along four dimensions: (a) reducing network latency by employing higher-performance network hardware (e.g., Fast Ethernet versus Myrinet); (b) reducing communication software overhead by developing more efficient communication libraries (MPICH versus TCPMPL, our TCP/IP-based message passing layer, versus MPI-BIP); (c) rewriting or restructuring the application code so that it communicates less frequently; and (d) deploying communication optimizations that exploit the application's inherent communication characteristics. This paper discusses our experiences in building a communication subsystem on a cluster of workstations for a fine-grained asynchronous application (a Time Warp synchronized discrete-event simulator). Specifically, it details our efforts to reduce communication latency along three of these four dimensions and reports performance results from an in-depth empirical evaluation of the communication subsystem.
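To make the motivation behind dimensions (c) and (d) concrete, the following is a minimal sketch of a generic cost model, not the authors' method or measurements. It assumes each network send pays a fixed software and network overhead plus a bandwidth-dependent transfer time, so that packing several small messages into one send amortizes the fixed part. The function name, parameters, and numbers are all hypothetical and chosen only for illustration.

```python
def total_send_time_us(num_messages, payload_bytes, per_send_overhead_us,
                       bandwidth_bytes_per_us, batch_size=1):
    """Estimate the time to send `num_messages` small messages.

    Hypothetical model (an assumption, not taken from the paper):
    every network send costs a fixed overhead, and the payload costs
    bytes / bandwidth. `batch_size` messages share one send.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Ceiling division: number of network sends actually issued.
    sends = -(-num_messages // batch_size)
    fixed_cost = sends * per_send_overhead_us
    transfer_cost = num_messages * payload_bytes / bandwidth_bytes_per_us
    return fixed_cost + transfer_cost


# Illustrative numbers only: 1000 messages of 100 bytes each,
# 100 us fixed overhead per send, 12.5 bytes/us of bandwidth.
unbatched = total_send_time_us(1000, 100, 100.0, 12.5, batch_size=1)
batched = total_send_time_us(1000, 100, 100.0, 12.5, batch_size=10)
print(unbatched, batched)
```

In this model the transfer term is unchanged by batching; only the fixed per-send overhead shrinks. The model deliberately ignores the delay that batching adds to each individual message, which matters for an optimistically synchronized application because late messages can trigger extra rollbacks.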

Support for this work was provided in part by the Advanced Research Projects Agency under contracts J-FBI-93-116 and DABT63-96-C-0055.



Author information

Authors and Affiliations

Computer Architecture Design Laboratory, Dept. of ECECS, PO Box 210030, Cincinnati, OH 45221-0030
Umesh Kumar V. Rajasekaran, Malolan Chetlur, Girindra D. Sharma, Radharamanan Radhakrishnan & Philip A. Wilsey


Editor information

José Rolim, Frank Mueller, Albert Y. Zomaya, Fikret Ercal, Stephan Olariu, Binoy Ravindran, Jan Gustafsson, Hiroaki Takada, Ron Olsson, Laxmikant V. Kale, Pete Beckman, Matthew Haines, Hossam ElGindy, Denis Caromel, Serge Chaumette, Geoffrey Fox, Yi Pan, Keqin Li, Tao Yang, G. Chiola, G. Conte, L. V. Mancini, Domenique Méry, Beverly Sanders, Devesh Bhatt, Viktor Prasanna


About this paper

Cite this paper

Rajasekaran, U.K.V., Chetlur, M., Sharma, G.D., Radhakrishnan, R., Wilsey, P.A. (1999). Addressing communication latency issues on clusters for fine grained asynchronous applications—A case study. In: Rolim, J., et al. Parallel and Distributed Processing. IPPS 1999. Lecture Notes in Computer Science, vol 1586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0097999


DOI: https://doi.org/10.1007/BFb0097999
Published: 28 October 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65831-3
Online ISBN: 978-3-540-48932-0
eBook Packages: Springer Book Archive
