Online MPI process mapping for coordinating locality and memory congestion on NUMA systems

Mulya Agung, Muhammad Alfian Amrizal, Ryusuke Egawa, Hiroyuki Takizawa

Research output: Contribution to journal › Article › peer-review


Mapping MPI processes to processor cores, called process mapping, is crucial to achieving scalable performance on multi-core processors. By analyzing the communication behavior among MPI processes, process mapping can improve communication locality and thus reduce the overall communication cost. However, on modern non-uniform memory access (NUMA) systems, memory congestion can degrade performance more severely than poor locality, because heavy congestion on shared caches and memory controllers can cause long latencies. Most existing work focuses only on improving locality or relies on offline profiling to analyze the communication behavior. We propose a process mapping method that dynamically remaps processes to adapt to changing communication behavior while coordinating locality and memory congestion. Our method works online during the execution of an MPI application. It does not require modifications to the application, prior knowledge of the communication behavior, or changes to the hardware or operating system. Experimental results show that our method achieves performance and energy efficiency close to those of the best static mapping method, with low overhead on application execution. In experiments with the NAS Parallel Benchmarks on a NUMA system, the performance and total energy improvements are up to 34% (18.5% on average) and 28.9% (13.6% on average), respectively. In experiments with two GROMACS applications on a larger NUMA system, the average improvements in performance and total energy consumption are 21.6% and 12.6%, respectively.
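
The trade-off the abstract describes, between keeping communicating processes on the same NUMA node and avoiding overload on any one node's memory system, can be illustrated with a small sketch. This is not the authors' algorithm; it is a hypothetical greedy heuristic written only to make the two objectives concrete. The communication matrix `comm`, the per-process memory load `load`, the weight `alpha`, and all function names are assumptions introduced for illustration.

```python
from itertools import combinations


def inter_node_traffic(comm, mapping):
    """Locality cost: total volume exchanged between processes on different nodes."""
    n = len(comm)
    return sum(comm[i][j] for i, j in combinations(range(n), 2)
               if mapping[i] != mapping[j])


def max_node_load(load, mapping, num_nodes):
    """Congestion proxy: the largest summed memory load on any single node."""
    per_node = [0.0] * num_nodes
    for proc, node in enumerate(mapping):
        per_node[node] += load[proc]
    return max(per_node)


def greedy_map(comm, load, num_nodes, cores_per_node, alpha=0.5):
    """Assign each process to a NUMA node, one process at a time.

    Hypothetical heuristic: each process goes to the node with the lowest
    weighted cost, where alpha weights remote communication and (1 - alpha)
    weights the node's accumulated memory load.
    """
    n = len(comm)
    if n > num_nodes * cores_per_node:
        raise ValueError("more processes than available cores")
    mapping = [None] * n
    node_load = [0.0] * num_nodes
    node_count = [0] * num_nodes
    # Place the heaviest communicators first (ties keep rank order).
    order = sorted(range(n), key=lambda p: -sum(comm[p]))
    for p in order:
        best_node, best_cost = None, None
        for node in range(num_nodes):
            if node_count[node] >= cores_per_node:
                continue  # no free core on this node
            remote = sum(comm[p][q] for q in range(n)
                         if mapping[q] is not None and mapping[q] != node)
            cost = alpha * remote + (1 - alpha) * (node_load[node] + load[p])
            if best_cost is None or cost < best_cost:
                best_node, best_cost = node, cost
        mapping[p] = best_node
        node_load[best_node] += load[p]
        node_count[best_node] += 1
    return mapping


# Example: ranks 0-1 and ranks 2-3 communicate heavily within each pair.
comm = [[0, 10, 1, 1],
        [10, 0, 1, 1],
        [1, 1, 0, 10],
        [1, 1, 10, 0]]
balanced = greedy_map(comm, [1, 1, 1, 1], num_nodes=2, cores_per_node=2)
# Ignoring locality (alpha=0) spreads the two memory-heavy ranks apart instead.
spread = greedy_map(comm, [5, 5, 1, 1], num_nodes=2, cores_per_node=2, alpha=0.0)
```

With uniform loads, the sketch co-locates each heavily communicating pair. With `alpha=0.0` and two memory-heavy ranks, it separates those ranks even though they communicate heavily, which is the kind of conflict between locality and congestion that the abstract says the proposed method coordinates. An online method would additionally need to observe communication at run time and migrate processes, which this static sketch does not attempt.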

Original language: English
Pages (from-to): 71-90
Number of pages: 20
Journal: Supercomputing Frontiers and Innovations
Issue number: 1
Publication status: Published - 2020 Jan 1


  • Communication
  • Congestion
  • Locality
  • MPI
  • Multi-core
  • NUMA
  • Process mapping

