TY - JOUR
T1 - Scalar tuning of a fluid solver using compact scheme for a supercomputer with a distributed memory architecture
AU - Aono, Hikaru
AU - Nonomura, Taku
AU - Iizuka, Nobuyuki
AU - Ohsako, Takahiko
AU - Inari, Tomohide
AU - Hashimoto, Yasutoshi
AU - Takaki, Ryoji
AU - Fujii, Kozo
PY - 2013
Y1 - 2013
N2 - Scalar tuning of a compressible fluid solver is carried out for a supercomputer with a distributed memory architecture. We use the K computer, one of the peta-scale supercomputers recently developed in Japan. The computational code "LANS3D" and its high-order compact differencing option are tuned. The original version of the code achieves approximately 4.5% of the full performance of the CPU for a simple test case. Scalar tuning based on combining do-loops works well, and the tuned code attains about 10% of full performance for the same case. The gains come from improved use of the cache, reduced data transfer, and efficient reuse of data once it has been transferred from memory to the cache, which hides the low speed of data transfer. The tuned code runs twice as fast as the original in wall-clock time and enables a parametric study of more than 160 cases of airfoil flow computed by large-eddy simulation with a high-order accurate, high-resolution numerical scheme.
AB - Scalar tuning of a compressible fluid solver is carried out for a supercomputer with a distributed memory architecture. We use the K computer, one of the peta-scale supercomputers recently developed in Japan. The computational code "LANS3D" and its high-order compact differencing option are tuned. The original version of the code achieves approximately 4.5% of the full performance of the CPU for a simple test case. Scalar tuning based on combining do-loops works well, and the tuned code attains about 10% of full performance for the same case. The gains come from improved use of the cache, reduced data transfer, and efficient reuse of data once it has been transferred from memory to the cache, which hides the low speed of data transfer. The tuned code runs twice as fast as the original in wall-clock time and enables a parametric study of more than 160 cases of airfoil flow computed by large-eddy simulation with a high-order accurate, high-resolution numerical scheme.
KW - Compact scheme
KW - Compressible fluid solver
KW - Large scale computation
KW - Large-eddy simulation
KW - Scalar tuning
UR - http://www.scopus.com/inward/record.url?scp=84891650729&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84891650729&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:84891650729
SN - 2180-1363
VL - 5
SP - 143
EP - 152
JO - CFD Letters
JF - CFD Letters
IS - 4
ER -