Register Flush-free Runahead Execution for Modern Vector Processors

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Modern vector processors are designed to achieve high sustained performance, especially in HPC applications, thanks to a powerful instruction set oriented toward data-level parallelism. In addition, the latest vector processor adopts out-of-order execution of vector instructions to exploit instruction-level parallelism, because there is a significant latency gap between vector arithmetic instructions and vector load/store instructions. Despite this effort, the gap still degrades the sustained performance of modern vector processors. This paper proposes a runahead execution mechanism for modern vector processors that fills the latency gap by further exploiting instruction-level parallelism. When the processor stalls on a long-latency instruction, a conventional runahead execution mechanism switches the processor from normal mode to runahead mode, in which the processor speculatively executes subsequent instructions that can cause stalls, together with the instructions they depend on. However, conventional runahead mechanisms flush the register values calculated in runahead mode when that mode ends, so the values cannot be reused in the subsequent normal mode. Because even a single vector register holds many values, these flushes and re-executions waste bandwidth between cores and caches. To solve this problem, the proposed mechanism retains the registers holding runahead-mode results so that the processor can use them after returning to normal mode. To use these registers correctly after exiting runahead mode, the proposed mechanism adds functions that carry the commit order information and the register aliasing information of the runahead-executed instructions over into normal mode.
The evaluation results show that the proposed mechanism improves performance by up to 20%, and by 3% on average, over the conventional mechanism.
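
The bandwidth argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's mechanism or its simulator: the names `Instr`, `cache_to_core_traffic`, `VECTOR_LENGTH`, and `runahead_window`, and the example program, are all hypothetical. It only counts how many vector elements move from cache to core when runahead results are flushed versus retained, and it ignores the commit-order and register-aliasing bookkeeping that the proposed mechanism needs for correctness.

```python
from dataclasses import dataclass

VECTOR_LENGTH = 256  # elements per vector register (illustrative assumption)


@dataclass(frozen=True)
class Instr:
    op: str                     # "vload" or "vadd" (hypothetical opcodes)
    long_latency: bool = False  # True if this load would stall the pipeline


def cache_to_core_traffic(program, runahead_window, keep_registers):
    """Count vector elements moved from cache to core for one pass.

    keep_registers=False models a conventional runahead scheme that flushes
    runahead results; keep_registers=True models retaining them.
    """
    traffic = 0
    runahead_done = set()  # indices of loads already executed in runahead mode
    for pc, instr in enumerate(program):
        if instr.op != "vload":
            continue
        if pc in runahead_done:
            if not keep_registers:
                # Flushed result: the load re-executes in normal mode.
                # It now hits in the cache but still moves a full vector.
                traffic += VECTOR_LENGTH
            continue
        traffic += VECTOR_LENGTH  # normal-mode load fills a register
        if instr.long_latency:
            # Stall: enter runahead mode and speculatively execute later loads.
            end = min(pc + 1 + runahead_window, len(program))
            for ahead in range(pc + 1, end):
                if program[ahead].op == "vload" and ahead not in runahead_done:
                    traffic += VECTOR_LENGTH
                    runahead_done.add(ahead)
    return traffic


program = [
    Instr("vload", long_latency=True),
    Instr("vload", long_latency=True),
    Instr("vadd"),
    Instr("vload", long_latency=True),
    Instr("vadd"),
]

flushed = cache_to_core_traffic(program, runahead_window=4, keep_registers=False)
retained = cache_to_core_traffic(program, runahead_window=4, keep_registers=True)
print(flushed, retained)  # 1280 768
```

In this toy program the flushing variant moves five vectors' worth of elements, while the retaining variant moves three, because the two loads executed during runahead are not repeated. The numbers say nothing about the paper's measured speedups.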

Original language: English
Title of host publication: Proceedings - 2021 IEEE 33rd International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021
Publisher: IEEE Computer Society
Pages: 114-125
Number of pages: 12
ISBN (Electronic): 9781665443012
Publication status: Published - 2021
Event: 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021 - Virtual, Online, Brazil
Duration: 2021 Oct 26 → 2021 Oct 29

Publication series

Name: Proceedings - Symposium on Computer Architecture and High Performance Computing
ISSN (Print): 1550-6533

Conference

Conference: 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021
Country/Territory: Brazil
City: Virtual, Online
Period: 21/10/26 → 21/10/29

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
