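Abstract: The regional climate model CWRF (Climate-Weather Research and Forecasting model) is an important component of the regional climate prediction system of the National Climate Center, and it is also the most time-consuming program in that system. High-performance computing is a key technology for improving the computational performance of CWRF numerical prediction. Porting and optimizing the CWRF model on the domestic Sunway many-core architecture to improve simulation efficiency is of great significance for the model's extension, development capability and sustainable development. Based on the domestic many-core SW26010 processor, this work completed the porting, performance analysis and in-depth performance optimization of the CWRF regional climate model. Using memory access optimization, cache hit rate optimization and many-core acceleration, the computational code of the dynamics, physics and I/O processes of CWRF was refactored and accelerated on the many-core processor. The results show that the optimizations speed up the dynamics process by 2 times on average and by up to 6.4 times, the physics process by 1.7 times on average and by up to 5.4 times, and the I/O process by 1.2 times. The program as a whole is accelerated by up to 1.4 times, with the computational error within a reasonable range.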
CLC number: TP391
Document code: A
DOI: 10.16157/j.issn.0258-7998.212397
Chinese citation format: 吕小敬,刘钊,蔡蕙伊,等. 面向国产神威众核架构的区域气候模式CWRF性能优化技术[J]. 电子技术应用,2022,48(1):31-38.
English citation format: Lv Xiaojing, Liu Zhao, Cai Huiyi, et al. Optimization technology for regional climate model-CWRF based on domestic Sunway many-core architecture[J]. Application of Electronic Technique, 2022, 48(1): 31-38.
Optimization technology for regional climate model-CWRF based on domestic Sunway many-core architecture
Lv Xiaojing1,2,Liu Zhao2,3,Cai Huiyi2,Li Jinwei2
1. China Ship Scientific Research Center, Wuxi 214000, China; 2. National Supercomputing Center in Wuxi, Wuxi 214000, China; 3. Tsinghua University, Beijing 100080, China
Abstract: CWRF (Climate-Weather Research and Forecasting model) is an important component of the regional climate prediction system of the National Climate Center, and it is the most time-consuming program in that system. High performance computing is a key technology for improving the computational performance of CWRF. Porting and optimizing the CWRF model on the domestic Sunway many-core system to improve simulation efficiency is of great significance for the extension, development capability and sustainable development of the model. This paper completes the porting, performance analysis and in-depth performance optimization of CWRF on the SW26010 many-core processor. Memory access optimization, cache hit rate optimization and many-core acceleration are applied to refactor and accelerate the code of the dynamic process, the physical process and the I/O process of CWRF. The results show that the dynamic process achieves an average speedup of 2 times and a maximum speedup of 6.4 times, the physical process achieves an average speedup of 1.7 times and a maximum speedup of 5.4 times, the I/O process achieves a speedup of 1.2 times, and the overall program achieves a speedup of up to 1.4 times, with the calculation error within a reasonable range.
Key words: CWRF; high performance computing; Sunway; SW26010