FPGA-Based Scalable and Power-Efficient Fluid Simulation using Floating-Point DSP Blocks

Kentaro Sano, Satoru Yamamoto

Research output: Contribution to journal › Article › peer-review

27 Citations (Scopus)


Large-scale fluid dynamics simulation requires high-performance, low-power computation. Because of inefficiencies in their architecture and structure, CPUs and GPUs now have difficulty improving power efficiency for this application. FPGAs have become promising alternatives for power-efficient, high-performance computation thanks to new architectures with floating-point (FP) DSP blocks, but their relatively narrow memory bandwidth calls for an appropriate way to fully exploit that advantage. This paper presents an architecture and design for scalable fluid simulation based on data-flow computing with a state-of-the-art FPGA. To exploit the available hardware resources, including FP DSPs, we introduce spatial and temporal parallelism, which scales performance further by adding more stream processing elements (SPEs) to an array. Performance modeling and a prototype implementation allow us to explore the design space for both the existing Altera Arria10 and the upcoming Intel Stratix10 FPGAs. We demonstrate that an Arria10 10AX115 FPGA achieves 519 GFlops at 9.67 GFlops/W with a stream bandwidth of only 9.0 GB/s, which is 97.9 percent of the peak performance of the 18 implemented SPEs. We also estimate that a Stratix10 FPGA can scale up to 6844 GFlops by appropriately combining spatial and temporal parallelism.
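
The Arria10 figures quoted in the abstract imply a few further quantities that the abstract does not state directly. The following is a minimal sketch of that arithmetic, not a reproduction of the authors' performance model: the function name `derived_figures` and the dictionary keys are hypothetical, and the derived values are simple ratios of the published numbers, assuming that the 9.67 GFlops/W figure refers to the same 519 GFlops run.

```python
# Figures quoted in the abstract for the Arria10 10AX115 prototype.
SUSTAINED_GFLOPS = 519.0      # sustained performance
GFLOPS_PER_WATT = 9.67        # reported power efficiency
STREAM_BANDWIDTH_GBS = 9.0    # stream bandwidth in GB/s
PEAK_FRACTION = 0.979         # sustained / peak (97.9 percent)
NUM_SPES = 18                 # implemented stream processing elements


def derived_figures():
    """Return quantities implied by the abstract's numbers.

    These are back-of-the-envelope ratios (an assumption of this sketch),
    not values measured or reported by the authors.
    """
    # Power implied by performance divided by efficiency.
    implied_power_w = SUSTAINED_GFLOPS / GFLOPS_PER_WATT
    # Peak performance of the SPE array implied by the 97.9 percent figure.
    implied_peak_gflops = SUSTAINED_GFLOPS / PEAK_FRACTION
    # Average peak contribution of one SPE.
    peak_per_spe_gflops = implied_peak_gflops / NUM_SPES
    # Floating-point operations performed per byte of streamed data.
    flops_per_byte = SUSTAINED_GFLOPS / STREAM_BANDWIDTH_GBS
    return {
        "implied_power_w": implied_power_w,
        "implied_peak_gflops": implied_peak_gflops,
        "peak_per_spe_gflops": peak_per_spe_gflops,
        "flops_per_byte": flops_per_byte,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the run draws roughly 54 W, the 18 SPEs have a combined peak of about 530 GFlops, and the design performs close to 58 floating-point operations per byte streamed. That high ratio is consistent with the abstract's point that the architecture copes with narrow memory bandwidth.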

Original language: English
Article number: 7893769
Pages (from-to): 2823-2837
Number of pages: 15
Journal: IEEE Transactions on Parallel and Distributed Systems
Issue number: 10
Publication status: Published - 2017 Oct 1


  • Custom computing machine
  • Floating-point
  • Fluid simulation
  • FPGA
  • High-performance computing
  • Stream computing

