Geographical Locality and Dynamic Data Migration for OpenMP Implementations of Adaptive PDE Solvers

  • Conference paper
OpenMP Shared Memory Parallel Programming (IWOMP 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4315)

Abstract

On cc-NUMA multiprocessors, the non-uniformity of main memory latencies motivates co-locating threads with the data they access. We call this special form of data locality geographical locality. In this article, we study the performance of a parallel PDE solver with adaptive mesh refinement. The solver is parallelized using OpenMP, and the adaptive mesh refinement makes dynamic load balancing necessary. Because the runtime adaptation causes the memory access pattern to change dynamically, achieving a high degree of geographical locality is a challenging task.

The main conclusions of the study are that (1) geographical locality is very important for the performance of the solver, (2) performance can be improved significantly by dynamic page migration of misplaced data, (3) a migrate-on-next-touch directive works well, whereas the first-touch strategy is less advantageous for programs with dynamically changing memory access patterns, and (4) the overhead of such migration is low compared to the total execution time.
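The contrast between the two placement strategies in conclusion (3) can be illustrated with a toy model. The sketch below is not the solver or the directive studied in the paper; it is a minimal simulation under simplifying assumptions: pages are touched by exactly one node per phase, each phase boundary stands in for a load-balancing step, and the function name `simulate`, the `sweeps` parameter, and the example partitions are all hypothetical. It only counts local and remote accesses and does not model migration cost.

```python
def simulate(phases, policy, sweeps=10):
    """Count local/remote page accesses in a toy two-policy NUMA model.

    phases: list of lists; phases[k][p] is the node whose thread accesses
            page p during phase k (a new phase = after load balancing).
    policy: "first-touch" or "migrate-on-next-touch".
    sweeps: assumed number of accesses to each page within one phase.
    Returns (local_accesses, remote_accesses, migrations).
    """
    n_pages = len(phases[0])
    home = [None] * n_pages          # node whose memory holds each page
    local = remote = migrations = 0

    for phase_no, owners in enumerate(phases):
        # Assumption: under migrate-on-next-touch, every page is re-marked
        # after each repartitioning, so the next toucher pulls it over.
        remark = policy == "migrate-on-next-touch" and phase_no > 0
        for page, node in enumerate(owners):
            if home[page] is None:
                home[page] = node            # first touch places the page
            elif remark and home[page] != node:
                home[page] = node            # page follows its new owner
                migrations += 1
            if home[page] == node:
                local += sweeps
            else:
                remote += sweeps
    return local, remote, migrations


# Hypothetical partitions of 8 pages over 2 nodes across three phases.
phases = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
]

for policy in ("first-touch", "migrate-on-next-touch"):
    print(policy, simulate(phases, policy))
```

In this model, first-touch leaves pages where the initial partition placed them, so every later repartitioning turns some accesses remote, while migrate-on-next-touch pays a small number of migrations to keep accesses local. Whether that trade-off is favorable on a real machine depends on the migration overhead, which the sketch deliberately ignores.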

Editor information

Matthias S. Mueller, Barbara M. Chapman, Bronis R. de Supinski, Allen D. Malony, Michael Voss

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nordén, M., Löf, H., Rantakokko, J., Holmgren, S. (2008). Geographical Locality and Dynamic Data Migration for OpenMP Implementations of Adaptive PDE Solvers. In: Mueller, M.S., Chapman, B.M., de Supinski, B.R., Malony, A.D., Voss, M. (eds) OpenMP Shared Memory Parallel Programming. IWOMP 2005. Lecture Notes in Computer Science, vol 4315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68555-5_31

  • DOI: https://doi.org/10.1007/978-3-540-68555-5_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68554-8

  • Online ISBN: 978-3-540-68555-5

  • eBook Packages: Computer Science, Computer Science (R0)
