Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing

Research output: Contribution to journal › Article › peer-review

45 Citations (Scopus)


This paper presents an effective scheme for clustering a huge data set on a PC cluster system in which each PC is equipped with a commodity programmable graphics processing unit (GPU). The proposed scheme achieves three-level hierarchical parallel processing of massive data clustering. A divide-and-conquer approach to parallel data clustering provides coarse-grain parallelism across multiple PCs, which communicate through a message-passing mechanism. In addition, by taking advantage of the GPU's parallel processing capability, the scheme exploits two types of fine-grain data parallelism at different levels of the nearest neighbor search, which is the most computationally intensive part of the data-clustering process. The performance of the scheme is compared with that of an implementation running entirely on the CPU. Experimental results clearly show that the proposed hierarchical parallel processing can remarkably accelerate the data-clustering task. In particular, GPU co-processing is quite effective at improving the computational efficiency of parallel data clustering on a PC cluster. Although data transfer from GPU to CPU is generally costly, the acceleration gained from GPU co-processing significantly reduces the total execution time of data clustering.
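
The coarse-grain level described above can be illustrated with a minimal sketch. It is not the paper's implementation: it runs sequentially in pure Python, a plain loop stands in for the PCs and their message passing, and the `nearest_centroid` function marks the nearest neighbor search that the paper offloads to the GPU. The function names, the strided partitioning, and the merge step (a weighted k-means over the local centroids) are all assumptions chosen for illustration; the abstract does not specify how partial results are combined.

```python
import random


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean).

    This search is the hot spot that the paper accelerates on the GPU;
    here it is an ordinary sequential loop.
    """
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def weighted_kmeans(points, weights, k, iterations=20, seed=0):
    """Lloyd-style k-means on weighted points.

    Returns the centroids and the total weight assigned to each centroid
    during the last assignment pass.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    totals = [0.0] * k
    for _ in range(iterations):
        sums = [[0.0] * dim for _ in range(k)]
        totals = [0.0] * k
        for point, weight in zip(points, weights):
            j = nearest_centroid(point, centroids)
            totals[j] += weight
            for d in range(dim):
                sums[j][d] += weight * point[d]
        for j in range(k):
            if totals[j] > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = [s / totals[j] for s in sums[j]]
    return centroids, totals


def divide_and_conquer_kmeans(points, k, num_workers):
    """Cluster each partition independently, then merge the local centroids.

    Each loop iteration stands in for one PC (hypothetical setup); a real
    cluster would run the partitions concurrently and exchange messages.
    """
    chunks = [points[i::num_workers] for i in range(num_workers)]
    local_centroids, local_weights = [], []
    for worker_id, chunk in enumerate(chunks):
        centroids, totals = weighted_kmeans(
            chunk, [1.0] * len(chunk), k, seed=worker_id
        )
        for centroid, weight in zip(centroids, totals):
            if weight > 0:  # skip empty clusters
                local_centroids.append(centroid)
                local_weights.append(weight)
    # Assumed merge step: cluster the local centroids, weighted by size.
    merged, _ = weighted_kmeans(local_centroids, local_weights, k)
    return merged
```

Calling `divide_and_conquer_kmeans(points, k=2, num_workers=4)` on a list of coordinate tuples returns `k` merged centroids. Each partition must hold at least `k` points, and the partitions together must yield at least `k` non-empty local clusters for the merge step.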

Original language: English
Pages (from-to): 219-234
Number of pages: 16
Journal: Journal of Supercomputing
Issue number: 3
Publication status: Published - 2006 Jun

Keywords


  • General-purpose computation on GPU (GPGPU)
  • PC cluster
  • Programmable graphics processing unit (GPU)
  • The divide-and-conquer approach
  • k-means data clustering

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Software
  • Information Systems
  • Hardware and Architecture

