Abstract
This paper presents a comprehensive story of the development of simpler performance models for distributed implementations of the Fast Fourier Transform in 3 Dimensions (FFT3D). We start by providing an overview of several implementations and their performance models. Then, we present arguments to support the use of a simple power function instead of the full performance models proposed by other publications. We argue that our model can be obtained for a particular problem size with minimal experimentation while other models require significant tuning to determine their constants.
Our advocacy for simpler performance models is inspired by the difficulties found when estimating the performance of FFT3D programs. Correctly estimating how well large-scale programs (such as FFT3D) will work is one of the most challenging problems faced by scientists. The significant effort devoted to this problem has resulted in the appearance of numerous works on performance modeling.
The results produced by an exhaustive performance modeling study may predict the performance of a program with a reasonably good accuracy. However, those studies may become unusable because their aim for accuracy can make them so difficult and cumbersome to use that direct experimentation with the program may be preferable, defeating their original purpose. We propose an alternative approach in this paper that does not require a full, accurate, performance model. Our approach mitigates the problem of existing performance models, each one of the parameters and constants in the model has to be carefully measured and tuned, a process that is intrinsically harder than direct experimentation with the program at hand.
Instead, we were able to simplify our approach by (1) building performance models that target particular applications in their normal operating conditions and (2) using simpler models that still produce good approximations for the particular case of a program’s normal operating environment.
We have conducted experiments using the Blue Fire Supercomputer at the National Center for Atmospheric Research (NCAR), showing that our simplified model can predict the performance of a particular implementation with a high degree of accuracy and very little effort when the program is used in its intended operating range.
Finally, although our performance model does not cover extreme cases, we show that its simple approximation under the normal operating conditions of FFT3D is able to provide solid, useful approximations.
Chapter PDF
Similar content being viewed by others
Keywords
- Performance Model
- Domain Decomposition
- Normal Operating Condition
- Fast Fourier Transform Algorithm
- Actual Experimental Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex fourier series. Mathematics of Computation 19(90), 297–301 (1965), http://www.jstor.org/stable/2003354
Ayala, O., Grabowski, W.W., Wang, L.P.: A hybrid approach for simulating turbulent collisions of hydrodynamically-interacting particles. J. Comput. Phys. 225(1), 51–73 (2007), http://dx.doi.org/10.1016/j.jcp.2006.11.016
Laasonen, K., Pasquarello, A., Car, R., Lee, C., Vanderbilt, D.: Car-parrinello molecular dynamics with van-derbilt ultrasoft pseudopotentials. Physical Review B 47(16), 10142 (1993)
Bylaska, E.J., Valiev, M., Kawai, R., Weare, J.H.: Parallel implementation of the projector augmented plane wave method for charged systems. Computer Physics Communications 143(1), 11–28 (2002), http://www.sciencedirect.com/science/article/pii/S0010465501004131
Calandra, H., Bothorel, F., Vezolle, P.: A massively parallel implementation of the common azimuth pre-stack depth migration. IBM J. Res. Dev. 52(1/2), 83–91 (2008), http://dl.acm.org/citation.cfm?id=1375990.1375998
Stellmach, S., Hansen, U.: An efficient spectral method for the simulation of dynamos in Cartesian geometry and its implementation on massively parallel computers. Geochemistry, Geophysics, Geosystems 9, 5003 (2008)
Wang, L., Ayala, O., Parishani, H., Grabowski, W., Wyszogrodzki, A., Piotrowski, Z., Gao, G., Kambhamettu, C., Li, X., Rossi, L., et al.: Towards an integrated multiscale simulation of turbulent clouds on petascale computers. Journal of Physics: Conference Series 318, 072021 (2011)
Dmitruk, P., Wang, L.P., Matthaeus, W., Zhang, R., Seckel, D.: Scalable parallel fft for spectral simulations on a beowulf cluster. Parallel Computing 27(14), 1921–1936 (2001), http://www.sciencedirect.com/science/article/pii/S016781910100120X
Li, N., Laizet, S.: 2decomp fft a highly scalable 2d decomposition library and fft interface. Cray User Group 2010 (2010)
Pekurovsky, D.: Ultrascalable fourier transfroms in three dimensions. In: Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, TG 2011, pp. 9:1–9:2. ACM, New York (2011), http://doi.acm.org/10.1145/2016741.2016751
Ayala, O., Wang, L.P.: Parallel implementation and scalability analysis of 3d fast fourier transform using 2d domain decomposition. Parallel Computing (submitted, 2012) (under review)
Frigo, M., Johnson, S.: Fftw: An adaptive software architecture for the fft. In: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 3, pp. 1381–1384. IEEE (1998)
Bluefire User Guide, http://www2.cisl.ucar.edu/docs/bluefire-user-guide
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Orozco, D., Garcia, E., Pavel, R., Ayala, O., Wang, LP., Gao, G. (2012). Demystifying Performance Predictions of Distributed FFT3D Implementations. In: Park, J.J., Zomaya, A., Yeo, SS., Sahni, S. (eds) Network and Parallel Computing. NPC 2012. Lecture Notes in Computer Science, vol 7513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35606-3_23
Download citation
DOI: https://doi.org/10.1007/978-3-642-35606-3_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35605-6
Online ISBN: 978-3-642-35606-3
eBook Packages: Computer ScienceComputer Science (R0)