Abstract
With the advent of cheap and powerful hardware for workstations and networks, a new cluster-based architecture for parallel processing applications has been envisioned. However, fine-grained asynchronous applications that communicate frequently are not ideal candidates for such architectures because of the high communication latency. Designers of fine-grained parallel applications on clusters therefore face the problem of reducing this latency. Depending on the resources available, communication latency can be improved along the following dimensions: (a) reducing network latency by employing higher-performance network hardware (e.g., Fast Ethernet versus Myrinet); (b) reducing communication software overhead by developing more efficient communication libraries (MPICH versus TCPMPL, our TCP/IP-based message passing layer, versus MPI-BIP); (c) rewriting or restructuring the application code so that it communicates less frequently; and (d) deploying communication optimizations that exploit the application's inherent communication characteristics. This paper discusses our experiences with building a communication subsystem on a cluster of workstations for a fine-grained asynchronous application (a Time Warp synchronized discrete-event simulator). Specifically, we detail and discuss our efforts to reduce communication latency along three of the four aforementioned dimensions. In addition, we report performance results from an in-depth empirical evaluation of the communication subsystem.
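As a rough illustration of why dimension (c), less frequent communication, can matter for fine-grained applications, the sketch below uses a simple linear cost model in which every message pays a fixed cost (software overhead plus network latency) in addition to its transfer time. The model, the function names, and all numeric values are assumptions chosen for illustration; they are not taken from the paper or its measurements.

```python
def unbatched_comm_time(num_messages, bytes_per_message,
                        per_message_cost_s, bandwidth_bytes_per_s):
    """Total time when every message is sent on its own.

    Assumed linear model: each message pays a fixed cost plus
    a transfer time proportional to its size.
    """
    transfer_s = bytes_per_message / bandwidth_bytes_per_s
    return num_messages * (per_message_cost_s + transfer_s)


def batched_comm_time(num_messages, bytes_per_message,
                      per_message_cost_s, bandwidth_bytes_per_s,
                      batch_size):
    """Total time when messages are grouped into batches of batch_size.

    The fixed cost is paid once per batch; the payload bytes are unchanged.
    """
    num_batches = -(-num_messages // batch_size)  # ceiling division
    total_bytes = num_messages * bytes_per_message
    return (num_batches * per_message_cost_s
            + total_bytes / bandwidth_bytes_per_s)


if __name__ == "__main__":
    # Hypothetical values: 1000 messages of 100 bytes, 100 microseconds of
    # fixed cost per send, and a 12.5 MB/s (100 Mbit/s) link.
    args = (1000, 100, 100e-6, 12.5e6)
    print("one send per message:", unbatched_comm_time(*args))
    print("batches of 10:       ", batched_comm_time(*args, batch_size=10))
```

Under these assumed numbers the fixed per-message cost dominates the total, so sending fewer, larger messages shrinks it considerably. Batching also delays the earliest messages in each batch, so this model alone does not show whether it helps a particular application.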
Support for this work was provided in part by the Advanced Research Projects Agency under contracts J-FBI-93-116 and DABT63-96-C-0055.
Copyright information
© 1999 Springer-Verlag
About this paper
Cite this paper
Rajasekaran, U.K.V., Chetlur, M., Sharma, G.D., Radhakrishnan, R., Wilsey, P.A. (1999). Addressing communication latency issues on clusters for fine grained asynchronous applications—A case study. In: Rolim, J., et al. Parallel and Distributed Processing. IPPS 1999. Lecture Notes in Computer Science, vol 1586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0097999
DOI: https://doi.org/10.1007/BFb0097999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65831-3
Online ISBN: 978-3-540-48932-0
eBook Packages: Springer Book Archive