TY - JOUR
T1 - Performance of SOR methods on modern vector and scalar processors
AU - Soga, Takashi
AU - Musa, Akihiro
AU - Okabe, Koki
AU - Komatsu, Kazuhiko
AU - Egawa, Ryusuke
AU - Takizawa, Hiroyuki
AU - Kobayashi, Hiroaki
AU - Takahashi, Shun
AU - Sasaki, Daisuke
AU - Nakahashi, Kazuhiro
N1 - Funding Information:
This research was partially supported by Grant-in-Aid for Scientific Research (S) #21226018.
PY - 2011/6
Y1 - 2011/6
AB - The building-cube method (BCM) is a new-generation algorithm for CFD simulations. The basic idea of BCM is to simplify the algorithm at all stages of flow computation in order to enable large-scale simulations. Calculating the pressure field with the Successive Over-Relaxation (SOR) method consumes most of the total execution time of BCM. This paper investigates effective implementations of the SOR method on modern vector and scalar processors. The NEC SX-9 and Intel Nehalem-EX are the latest vector and scalar processors, respectively. These processors have much higher peak performance than their previous-generation counterparts. However, improvements in memory bandwidth have not kept pace with improvements in processor performance, which is the so-called memory wall problem. We discuss optimization techniques for implementing the SOR method based on the architectural characteristics of these modern processors and evaluate their effects on the sustained performance of these processors for BCM.
KW - Building-cube method
KW - SOR method
KW - Vector and scalar processing
UR - http://www.scopus.com/inward/record.url?scp=79954615158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79954615158&partnerID=8YFLogxK
U2 - 10.1016/j.compfluid.2010.12.024
DO - 10.1016/j.compfluid.2010.12.024
M3 - Article
AN - SCOPUS:79954615158
SN - 0045-7930
VL - 45
SP - 215
EP - 221
JO - Computers and Fluids
JF - Computers and Fluids
IS - 1
ER -