TY - GEN
T1 - Exploiting the Potentials of the Second Generation SX-Aurora TSUBASA
AU - Egawa, Ryusuke
AU - Fujimoto, Souya
AU - Yamashita, Tsuyoshi
AU - Sasaki, Daisuke
AU - Isobe, Yoko
AU - Shimomura, Yoichi
AU - Takizawa, Hiroyuki
N1 - Funding Information:
This work was supported in part by MEXT as “Next Generation High-Performance Computing Infrastructures and Applications R&D Program,” (R&D of A Quantum-Annealing-Assisted Next Generation HPC Infrastructure and its Applications), and “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures” in Japan (Project ID: jh200007-NAH).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/11
Y1 - 2020/11
AB - NEC SX-series vector supercomputers have provided outstanding memory bandwidths to meet the strong demands for efficient execution of memory-intensive scientific applications in practice. Inheriting the advantage, the 2nd generation SX-Aurora TSUBASA, Type 20B, provides an extremely high memory bandwidth of 1.53 TB/s per vector processor. Unlike conventional SX-series systems, SX-Aurora TSUBASA also offers various execution modes to execute a diversity of emerging scientific workloads efficiently. As a result, application developers need to understand their workloads and the performance characteristics of SX-Aurora TSUBASA, and select an optimization strategy assuming an appropriate execution mode to fully exploit the system performance. Therefore, this paper discusses workload characterization by performance bottleneck analysis to determine an optimization strategy for the 2nd generation SX-Aurora TSUBASA. The evaluation results with benchmarks and real-world applications demonstrate that the workload characterization approach can accurately find the bottleneck and characterize various workloads, by helping application developers decide the optimization strategies for individual workloads. Since we can consider SX-Aurora TSUBASA as a typical example of the latest processors with high memory bandwidths, the workload characterization approach will also be helpful for other future processors.
KW - Bottleneck Analysis
KW - Bytes/Flops rate
KW - Performance Tuning Strategy
KW - Vector Computer
UR - http://www.scopus.com/inward/record.url?scp=85099578204&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099578204&partnerID=8YFLogxK
U2 - 10.1109/PMBS51919.2020.00010
DO - 10.1109/PMBS51919.2020.00010
M3 - Conference contribution
AN - SCOPUS:85099578204
T3 - Proceedings of PMBS 2020: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 39
EP - 49
BT - Proceedings of PMBS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2020
Y2 - 12 November 2020
ER -