TY - GEN
T1 - Vectorization-aware loop optimization with user-defined code transformations
AU - Takizawa, Hiroyuki
AU - Reimann, Thorsten
AU - Komatsu, Kazuhiko
AU - Soga, Takashi
AU - Egawa, Ryusuke
AU - Musa, Akihiro
AU - Kobayashi, Hiroaki
N1 - Funding Information:
This work is partially supported by JST CREST “An Evolutionary Approach to Construction of a Software Development Environment for Massively-Parallel Heterogeneous Systems”, DFG SPPEXA ExaFSA project, and Grant-in-Aid for Scientific Research (B) 16H02822.
Publisher Copyright:
© 2017 IEEE.
PY - 2017/9/22
Y1 - 2017/9/22
N2 - The cost of maintaining an application code would significantly increase if the application code is branched into multiple versions, each of which is optimized for a different architecture. In this work, default and vector versions of a real-world application code are refactored to be a single version, and the differences between the versions are expressed as user-defined code transformations. As a result, application developers can maintain only the single version, and transform it to its vector version just before the compilation. Although code optimizations for a vector processor are sometimes different from those for other processors, application developers can enjoy the performance of the vector processor without increasing the code complexity. Evaluation results demonstrate that vectorization-aware loop optimization for a vector processor can be expressed as user-defined code transformation rules, and thereby significantly improve the performance of a vector processor without major code modifications.
KW - User-defined code transformation
KW - Vectorization-aware loop optimization
KW - Xevolver
UR - http://www.scopus.com/inward/record.url?scp=85032627266&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85032627266&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2017.102
DO - 10.1109/CLUSTER.2017.102
M3 - Conference contribution
AN - SCOPUS:85032627266
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 685
EP - 692
BT - Proceedings - 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
Y2 - 5 September 2017 through 8 September 2017
ER -