TY - GEN
T1 - NeoSYCL
T2 - 2021 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021
AU - Ke, Yinan
AU - Agung, Mulya
AU - Takizawa, Hiroyuki
N1 - Funding Information:
This work was partially supported by MEXT Next Generation High-Performance Computing Infrastructures and Applications R&D Program "R&D of A Quantum-Annealing-Assisted Next Generation HPC Infrastructure and its Applications" Grant-in-Aid for Scientific Research (A) #20H00593.
Publisher Copyright:
© 2021 Owner/Author.
PY - 2021/1/20
Y1 - 2021/1/20
N2 - Recently, the high-performance computing world has moved to more heterogeneous architectures. Thus, it has become standard practice to offload part of application execution to dedicated accelerators. However, low productivity remains a problem when programming for accelerators. This paper proposes neoSYCL, a SYCL implementation for SX-Aurora TSUBASA that aims to improve productivity and achieve performance comparable to native implementations. Unlike other implementations, neoSYCL can identify and separate the kernel part of the SYCL code at the source code level. Thus, this approach can easily be ported to any heterogeneous architecture that uses the offload programming model. In this paper, we show the evaluation results on SX-Aurora TSUBASA. To quantitatively discuss not only performance but also productivity, we use two different benchmarks and code-complexity metrics for the evaluation. The results show that neoSYCL can improve productivity while reaching the same performance as native implementations.
AB - Recently, the high-performance computing world has moved to more heterogeneous architectures. Thus, it has become standard practice to offload part of application execution to dedicated accelerators. However, low productivity remains a problem when programming for accelerators. This paper proposes neoSYCL, a SYCL implementation for SX-Aurora TSUBASA that aims to improve productivity and achieve performance comparable to native implementations. Unlike other implementations, neoSYCL can identify and separate the kernel part of the SYCL code at the source code level. Thus, this approach can easily be ported to any heterogeneous architecture that uses the offload programming model. In this paper, we show the evaluation results on SX-Aurora TSUBASA. To quantitatively discuss not only performance but also productivity, we use two different benchmarks and code-complexity metrics for the evaluation. The results show that neoSYCL can improve productivity while reaching the same performance as native implementations.
KW - Heterogeneous computing
KW - LLVM
KW - NEC SX-Aurora
KW - SYCL
UR - http://www.scopus.com/inward/record.url?scp=85099878062&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099878062&partnerID=8YFLogxK
U2 - 10.1145/3432261.3432268
DO - 10.1145/3432261.3432268
M3 - Conference contribution
AN - SCOPUS:85099878062
T3 - ACM International Conference Proceeding Series
SP - 50
EP - 57
BT - Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021
PB - Association for Computing Machinery
Y2 - 20 January 2021 through 22 January 2021
ER -