Computer Science (计算机科学), 2022, Vol. 49, Issue (6): 99-107. doi: 10.11896/jsjkx.210400157
陈鑫1, 李芳1, 丁海昕2, 孙唯哲1, 刘鑫1, 陈德训1, 叶跃进1, 何香1
CHEN Xin1, LI Fang1, DING Hai-xin2, SUN Wei-ze1, LIU Xin1, CHEN De-xun1, YE Yue-jin1, HE Xiang1
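Abstract: Sunway TaihuLight ranked first on the global TOP500 supercomputer list from 2016 to 2018, with a peak performance of 125.4 PFlops. Its computing power comes mainly from the domestically developed SW26010 many-core processor. CFD computation on unstructured grids involves complex topological relationships, severe discrete (irregular) memory access, and the solution of linearized equations with strong data dependencies, so it has long been a difficult case for porting and optimization on domestic many-core supercomputers. To fully exploit the computing efficiency of the domestic heterogeneous many-core architecture, a data reconstruction model is first proposed that improves data locality and parallelizability, making the data structures better suited to the characteristics of the many-core architecture. Second, to address the discrete memory access caused by the unordered storage of unstructured-grid data, a discrete memory access optimization method based on pre-stored relationship information is proposed, which converts discrete accesses into contiguous accesses. Finally, for the solution of the strongly dependent linearized equations, the idea of pipeline parallelism across the slave-core array is introduced to achieve many-core parallelism. After optimization, the overall performance of the CFD unstructured-grid computation is improved by 4.19 times compared with the original version and by 1.2 times compared with a general-purpose CPU, and the code scales to 624,000 computing cores while maintaining a parallel efficiency of 64.5%.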
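The idea of converting discrete accesses into contiguous ones can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the face-based `owner`/`neighbor` layout, and the sample values are all assumptions chosen for illustration. The face-to-cell relationship is stored once up front, scattered cell values are packed into face-ordered buffers, and the compute loop then only streams through contiguous data.

```python
# Illustrative sketch only: all names and data here are hypothetical,
# not taken from the paper.

def pack_contiguous(cell_values, index):
    """Gather scattered cell values into a buffer ordered by face.

    `index` is the pre-stored relationship: for each face, the cell it needs.
    """
    return [cell_values[i] for i in index]


def face_differences_direct(cell_values, owner, neighbor):
    """Baseline: indirect (discrete) cell access inside the compute loop."""
    return [cell_values[n] - cell_values[o] for o, n in zip(owner, neighbor)]


def face_differences_packed(left, right):
    """Compute loop that only reads contiguous, face-ordered buffers."""
    return [r - l for l, r in zip(left, right)]


# A tiny hypothetical mesh: 4 cells, 4 faces with unordered cell numbering.
cell_values = [10.0, 20.0, 35.0, 50.0]
owner = [0, 2, 1, 0]      # cell on one side of each face
neighbor = [3, 1, 2, 1]   # cell on the other side of each face

# Packing step: the only place where indirect indexing happens.
left = pack_contiguous(cell_values, owner)
right = pack_contiguous(cell_values, neighbor)

direct = face_differences_direct(cell_values, owner, neighbor)
packed = face_differences_packed(left, right)
print(direct == packed)
```

In this sketch the indirect indexing is isolated in the packing step, so the arithmetic loop itself has a regular access pattern. How the paper organizes its pre-stored information and data movement on the SW26010 is not described in the abstract.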