[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

GPU Computations and Memory Access Model Based on Petri Nets

  • Chapter
  • First Online:
Transactions on Petri Nets and Other Models of Concurrency XIII

Part of the book series: Lecture Notes in Computer Science ((TOPNOC,volume 11090))

Abstract

In modern systems CPUs as well as GPUs are equipped with multi-level memory architectures, where different levels of the hierarchy vary in latency and capacity. Therefore, various memory access models were studied. Such a model can be seen as an interface abstracting the user from the physical architecture details. In this paper we present a general and uniform GPU computation and memory access model based on bounded inhibitor Petri nets (PNs). Its effectiveness is demonstrated by comparing its throughputs to practical computational experiments performed with the usage of Nvidia GPU with CUDA architecture.

Our PN model is consistent with the workflow of multithreaded GPU streaming multiprocessors. It models a selection and execution of instructions for each warp. The three types of instructions included in the model are: the arithmetic operation, the access to the shared memory and the access to the global memory. For a given algorithm the model allows to check how efficient the parallelization is, and whether a different organization of threads will improve performance.

The accuracy of our model was tested with different kernels. As the preliminary experiments we used the matrix multiplication program and stability example created by Nvidia, and as the main experiment a binary version of the least significant digit radix sort algorithm. We created three implementations of the algorithm using CUDA architecture, differing in the usage of shared and global memory as well as organization of calculations. For each implementation the PN model was used and the results of experiments are presented in the work.

This research has been supported by the Polish National Science Center through grant No. 2013/09/D/ST6/03928.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

eBook
GBP 12.99
Price includes VAT (United Kingdom)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 44.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    Note that in the case of bounded nets the use of inhibitors is not necessary, one can provide an equivalent (with more complex structure) net without inhibitors.

References

  1. Chiola, G., Donatelli, S., Franceschinis, G.: Priorities, inhibitor arcs and concurrency in P/T nets. In: Proceedings of ICATPN, vol. 91, pp. 182–205 (1991)

    Google Scholar 

  2. Nvidia Corporation. CUDA. http://www.nvidia.com/object/cuda_home_new.html

  3. Nvidia Corporation. CUDA. Best practice guide version 8.0.61 (2017)

    Google Scholar 

  4. Corporation, Nvidia: CUDA. C programming guide version 7.5 (2017)

    Google Scholar 

  5. Duvanenko, V.J.: Algorithm improvement through performance measurement (2009). http://www.drdobbs.com/architecture-and-design/algorithm-improvement-through-performanc/220000504

  6. Grama, A.: Introduction to Parallel Computing. Pearson Education, London (2003)

    Google Scholar 

  7. Hong, S., Kim, H.: An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In: ACM SIGARCH Computer Architecture News, vol. 37, pp. 152–163. ACM (2009)

    Google Scholar 

  8. Hwang, K., Jotwani, N.: Advanced Computer Architecture, 3rd edn. McGraw-Hill Education, New York (2011)

    Google Scholar 

  9. Janicki, R., Koutny, M.: Structure of concurrency. Theoret. Comput. Sci. 112(1), 5–52 (1993)

    Article  MathSciNet  Google Scholar 

  10. Jensen, K., Kristensen, L.M.: Coloured Petri Nets - Modelling and Validation of Concurrent Systems. Springer, Heidelberg (2009). https://doi.org/10.1007/b95112

    Book  MATH  Google Scholar 

  11. Luebke, D., Owens, J., Roberts, M., Lee, C.-H.: Intro to Parallel Programming - Online Course. Nvidia Corporation, Santa Clara

    Google Scholar 

  12. Ma, L., Agrawal, K., Chamberlain, R.D.: A memory access model for highly-threaded many-core architectures. Future Gen. Comput. Syst. 30, 202–215 (2014)

    Article  Google Scholar 

  13. Ma, L., Chamberlain, R.D., Agrawal, K.: Performance modeling for highly-threaded many-core GPUs. In: 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), pp. 84–91. IEEE (2014)

    Google Scholar 

  14. Madougou, S., Varbanescu, A.L., de Laat, C.: Using colored petri nets for GPGPU performance modeling. In: Proceedings of the ACM International Conference on Computing Frontiers, pp. 240–249. ACM (2016)

    Google Scholar 

  15. Mukherjee, R.: A performance prediction model for the CUDA GPGPU platform. Ph.D. thesis, International Institute of Information Technology Hyderabad, India (2010)

    Google Scholar 

  16. Murata, T.: Petri nets: properties, analysis and applications. Proc. IEEE 77(4), 541–580 (1989)

    Article  Google Scholar 

  17. Patterson, D.A.: Computer Architecture: A Quantitative Approach. Elsevier, Amsterdam (2011)

    Google Scholar 

  18. Ratzer, A.V., et al.: CPN tools for editing, simulating, and analysing coloured petri nets. In: van der Aalst, W.M.P., Best, E. (eds.) ICATPN 2003. LNCS, vol. 2679, pp. 450–462. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-44919-1_28

    Chapter  Google Scholar 

  19. Reisig, W.: Petri Nets: An Introduction, vol. 4. Springer, Heidelberg (2012)

    Google Scholar 

  20. Satish, N., Harris, M., Garland, M.: Designing efficient sorting algorithms for manycore GPUs. In: IEEE International Symposium on Parallel & Distributed Processing, IPDPS 2009, pp. 1–10. IEEE (2009)

    Google Scholar 

  21. Seward, H.E.: Information sorting in the application of electronic digital computers to business operations, Master Thesis, MIT (1954)

    Google Scholar 

  22. Shuaiwen, S., Chunyi, S., Barry, R., Kirk, C.: A simplified and accurate model of power-performance efficiency on emergent GPU architecure. In: 2013 IEEE 27th International Symposium on Parallel & Distributed Processing (IPDPS), pp. 673–686 (2013)

    Google Scholar 

  23. Storti, D., Yurtoglu, M.: CUDA for Engineers: An Introduction to High-performance Parallel Computing. Addison-Wesley Professional, Boston (2015)

    Google Scholar 

  24. Westergaard, M., (Eric) Verbeek, H.M.W.: CPN Tools official webpage. Eindhoven University of Technology

    Google Scholar 

  25. Williams, S., Waterman, A., Patterson, D.: Roofline: an insightful visual performance model for multicore architectures. Commun. ACM 52(4), 65–76 (2009)

    Article  Google Scholar 

  26. Cheng, S.-T., Hung, Y.: Estimation of job execution time in mapreduce framework over GPU clusters. In: The Fifth International Conference on Performance, Safety and Robustness in Complex Systems and Applications, PESARO 2015 (2015)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Anna Gogolińska .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer-Verlag GmbH Germany, part of Springer Nature

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Gogolińska, A., Mikulski, Ł., Piątkowski, M. (2018). GPU Computations and Memory Access Model Based on Petri Nets. In: Koutny, M., Kristensen, L., Penczek, W. (eds) Transactions on Petri Nets and Other Models of Concurrency XIII. Lecture Notes in Computer Science(), vol 11090. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-58381-4_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-662-58381-4_7

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-58380-7

  • Online ISBN: 978-3-662-58381-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics