A prototype implementation of OpenCL for SX vector systems

Research output: Contribution to conference › Paper › peer-review

1 Citation (Scopus)

Abstract

OpenCL is a new programming specification whose current implementations are mostly used for high-performance computing with graphics processing units (GPUs), so-called GPU computing. However, the OpenCL specification itself is not specialized for GPU computing. In this research project, therefore, we propose to use the OpenCL specification to describe the collaborative work of scalar systems and an NEC SX vector supercomputing system. Since there is no OpenCL implementation for the SX systems, we translate the part of an OpenCL code that is written in OpenCL C into standard C++ code. After the translation, the generated code is compiled with a native SX C++ compiler to produce an executable program that runs on the SX system. This paper presents a prototype implementation of an OpenCL-to-C translator to evaluate the potential of using the SX system for accelerating OpenCL applications. The evaluation results indicate that an SMP node can outperform a single GPU by improving the vectorization ratio, even though the benchmark programs are completely optimized for GPUs. In addition, as data parallelism is explicitly described in OpenCL C code, the performance of the code generated by the OpenCL-to-C translator scales with the number of SX processors. Accordingly, the SMP node can be used as a very powerful accelerator with a huge memory space.

Original language: English
Pages: 41-50
Number of pages: 10
Publication status: Published - 2012
Event: 2011 14th Teraflop Workshop - Stuttgart, Germany
Duration: 2011 Dec 5 to 2011 Dec 6

Conference

Conference: 2011 14th Teraflop Workshop
Country/Territory: Germany
City: Stuttgart
Period: 11/12/5 to 11/12/6
