Optimization for iterative queries on MapReduce

Makoto Onizuka, Hiroyuki Kato, Soichiro Hidaka, Keisuke Nakano, Zhenjiang Hu

Research output: Contribution to journal › Conference article › peer-review

11 Citations (Scopus)

Abstract

We propose OptIQ, a query optimization approach for iterative queries in distributed environments. OptIQ removes redundant computations across iterations by extending the traditional techniques of view materialization and incremental view evaluation. First, OptIQ decomposes an iterative query into an invariant view and a variant view, and materializes the invariant view. Redundant computations are removed by reusing the materialized view across iterations. Second, OptIQ evaluates the variant view incrementally, so that redundant computations are removed by skipping the evaluation of tuples in the variant view that have already converged. We verify the effectiveness of OptIQ with PageRank and k-means clustering queries on real datasets. The results show that OptIQ achieves high efficiency, running up to five times faster than evaluation that does not remove the redundant computations among iterations.
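The two ideas in the abstract can be illustrated with a minimal single-machine sketch in Python. This is not the authors' implementation and does not run on MapReduce; it only shows the general pattern on PageRank. The function names `materialize_invariant_view` and `pagerank_incremental`, and the parameters `damping`, `tol`, and `max_iters`, are hypothetical choices made for this illustration. The invariant view here is the edge list joined with each source's out-degree, which never changes between iterations. The variant view is the rank table, which is updated by propagating only the pending changes of nodes that have not yet converged.

```python
from collections import defaultdict


def materialize_invariant_view(edges):
    """Join every edge with 1/out-degree of its source, once, before iterating.

    This join does not depend on the ranks, so it is computed a single
    time and reused in every iteration (assumed form of an invariant view).
    """
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1
    view = defaultdict(list)
    for src, dst in edges:
        view[src].append((dst, 1.0 / out_degree[src]))
    return view


def pagerank_incremental(edges, damping=0.85, tol=1e-10, max_iters=200):
    """Delta-propagation PageRank that skips nodes whose pending change is tiny."""
    view = materialize_invariant_view(edges)
    nodes = {n for edge in edges for n in edge}
    n = len(nodes)
    rank = {v: 0.0 for v in nodes}
    # Pending (not yet applied) change to each node's rank.
    delta = {v: (1.0 - damping) / n for v in nodes}

    for _ in range(max_iters):
        # Only nodes with a non-negligible pending change are evaluated.
        active = [v for v in nodes if abs(delta[v]) > tol]
        if not active:
            break
        incoming = defaultdict(float)
        for u in active:
            change = delta[u]
            rank[u] += change
            delta[u] = 0.0
            # Reuse the materialized join instead of recomputing out-degrees.
            for dst, weight in view.get(u, ()):
                incoming[dst] += damping * change * weight
        for v, change in incoming.items():
            delta[v] += change
    return rank
```

For example, `pagerank_incremental([("a", "b"), ("b", "c"), ("c", "a")])` gives each of the three nodes a rank close to 1/3. Nodes whose pending change stays below `tol` are left out of the `active` list, which is the sketch's stand-in for skipping converged tuples.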

Original language: English
Pages (from-to): 241-252
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 7
Issue number: 4
Publication status: Published - 2013 Dec
Event: Proceedings of the 40th International Conference on Very Large Data Bases, VLDB 2014 - Hangzhou, China
Duration: 2014 Sept 1 to 2014 Sept 5
