TY - JOUR
T1 - Optimizing Load Balance in a Parallel CFD Code for a Large-scale Turbine Simulation on a Vector Supercomputer
AU - Watanabe, Osamu
AU - Komatsu, Kazuhiko
AU - Sato, Masayuki
AU - Kobayashi, Hiroaki
N1 - Funding Information:
This research was supported in part by MEXT as “Next Generation High-Performance Computing Infrastructures and Applications R&D Program,” entitled “R&D of A Quantum-Annealing-Assisted Next Generation HPC Infrastructure and its Applications.” The authors thank Satoru Yamamoto, Takashi Furusawa, and Hironori Miyazawa of Tohoku University for their fruitful discussions and valuable comments. This paper is distributed under the terms of the Creative Commons Attribution-Non Commercial 3.0 License, which permits non-commercial use, reproduction, and distribution of the work without further permission, provided the original work is properly cited.
Publisher Copyright:
© 2021. The Authors. All Rights Reserved.
PY - 2021
Y1 - 2021
N2 - A turbine for power generation is an essential piece of infrastructure in our society. A turbine’s failure has severe social and economic impacts on everyday life. Therefore, it is necessary to foresee such failures in advance. However, it is not easy to anticipate these failures from a real turbine. Hence, the various events occurring in the turbine must be reproduced by numerical simulation. A multiphysics CFD code, “Numerical Turbine,” has been developed on vector supercomputer systems for large-scale simulations of unsteady wet steam flows inside a turbine. The Numerical Turbine code is a block-structured code parallelized with MPI, and its calculation space consists of grid blocks of different sizes. Therefore, load imbalance occurs when the code is executed in parallel with MPI. This paper builds an estimation model that derives the calculation time of each grid block from its calculation amount and calculation performance. It then proposes an OpenMP parallelization method for balancing the load of MPI applications. Based on the model, the proposed method reduces the load imbalance by taking into account how vector performance varies with the calculation amount. Moreover, the proposed method can determine whether the load imbalance needs to be reduced without a pre-execution. The performance evaluation shows that the proposed method reduces the load imbalance from 24.4% to 9.3%.
AB - A turbine for power generation is an essential piece of infrastructure in our society. A turbine’s failure has severe social and economic impacts on everyday life. Therefore, it is necessary to foresee such failures in advance. However, it is not easy to anticipate these failures from a real turbine. Hence, the various events occurring in the turbine must be reproduced by numerical simulation. A multiphysics CFD code, “Numerical Turbine,” has been developed on vector supercomputer systems for large-scale simulations of unsteady wet steam flows inside a turbine. The Numerical Turbine code is a block-structured code parallelized with MPI, and its calculation space consists of grid blocks of different sizes. Therefore, load imbalance occurs when the code is executed in parallel with MPI. This paper builds an estimation model that derives the calculation time of each grid block from its calculation amount and calculation performance. It then proposes an OpenMP parallelization method for balancing the load of MPI applications. Based on the model, the proposed method reduces the load imbalance by taking into account how vector performance varies with the calculation amount. Moreover, the proposed method can determine whether the load imbalance needs to be reduced without a pre-execution. The performance evaluation shows that the proposed method reduces the load imbalance from 24.4% to 9.3%.
KW - MPI
KW - OpenMP
KW - hybrid parallelization
KW - load balance
KW - turbine simulation code
KW - vector supercomputer
UR - http://www.scopus.com/inward/record.url?scp=85118799375&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118799375&partnerID=8YFLogxK
U2 - 10.14529/js210207
DO - 10.14529/js210207
M3 - Article
AN - SCOPUS:85118799375
SN - 2409-6008
VL - 8
SP - 114
EP - 130
JO - Supercomputing Frontiers and Innovations
JF - Supercomputing Frontiers and Innovations
IS - 2
ER -