TY - JOUR
T1 - Parallel processing of the Building-Cube Method on a GPU platform
AU - Komatsu, Kazuhiko
AU - Soga, Takashi
AU - Egawa, Ryusuke
AU - Takizawa, Hiroyuki
AU - Kobayashi, Hiroaki
AU - Takahashi, Shun
AU - Sasaki, Daisuke
AU - Nakahashi, Kazuhiro
N1 - Funding Information:
The authors would like to thank the reviewers for their thoughtful review and helpful comments. This research was partially supported by Grant-in-Aid for Scientific Research (S) #21226018; Grant-in-Aid for Young Scientists (B) #21700049; NAKAYAMA HAYAO Foundation for Science & Technology and Culture; and Core Research of Evolutional Science and Technology of Japan Science and Technology Agency (JST CREST).
PY - 2011/6
Y1 - 2011/6
N2 - The Building-Cube Method (BCM), based on equally-spaced Cartesian meshes, has been proposed as a next-generation CFD method. Because of its equally-spaced meshes, it is well suited to highly parallel computation. This paper proposes a parallel implementation scheme of BCM on a GPU cluster system, which requires efficient hierarchical parallel processing to exploit the potential of the cluster. The proposed scheme employs the Red-Black SOR method for the pressure calculation, which is the most time-consuming part of BCM, to obtain massive data parallelism. By exploiting the coarse-grain and fine-grain parallelism of BCM, the proposed scheme hierarchically assigns equally-divided tasks to the GPU cluster system. Furthermore, to exploit the computational power of the GPUs in the cluster, the proposed scheme employs efficient data management techniques such as coalesced data transfer and data reuse in on-chip memory. Experimental results show that the single-GPU implementation achieves about three times higher performance than the single-CPU implementation. Moreover, the multiple-GPU implementation achieves almost ideal scalability. Finally, the possibility of further accelerating not only the pressure calculation but also the whole of BCM is discussed.
AB - The Building-Cube Method (BCM), based on equally-spaced Cartesian meshes, has been proposed as a next-generation CFD method. Because of its equally-spaced meshes, it is well suited to highly parallel computation. This paper proposes a parallel implementation scheme of BCM on a GPU cluster system, which requires efficient hierarchical parallel processing to exploit the potential of the cluster. The proposed scheme employs the Red-Black SOR method for the pressure calculation, which is the most time-consuming part of BCM, to obtain massive data parallelism. By exploiting the coarse-grain and fine-grain parallelism of BCM, the proposed scheme hierarchically assigns equally-divided tasks to the GPU cluster system. Furthermore, to exploit the computational power of the GPUs in the cluster, the proposed scheme employs efficient data management techniques such as coalesced data transfer and data reuse in on-chip memory. Experimental results show that the single-GPU implementation achieves about three times higher performance than the single-CPU implementation. Moreover, the multiple-GPU implementation achieves almost ideal scalability. Finally, the possibility of further accelerating not only the pressure calculation but also the whole of BCM is discussed.
KW - Building-Cube Method
KW - GPGPU
KW - Multiple GPUs
UR - http://www.scopus.com/inward/record.url?scp=79954608931&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79954608931&partnerID=8YFLogxK
U2 - 10.1016/j.compfluid.2010.12.019
DO - 10.1016/j.compfluid.2010.12.019
M3 - Article
AN - SCOPUS:79954608931
SN - 0045-7930
VL - 45
SP - 122
EP - 128
JO - Computers and Fluids
JF - Computers and Fluids
IS - 1
ER -