TY - JOUR
T1 - Optimization for iterative queries on MapReduce
AU - Onizuka, Makoto
AU - Kato, Hiroyuki
AU - Hidaka, Soichiro
AU - Nakano, Keisuke
AU - Hu, Zhenjiang
PY - 2013/12
Y1 - 2013/12
N2 - We propose OptIQ, a query optimization approach for iterative queries in a distributed environment. OptIQ removes redundant computations among different iterations by extending the traditional techniques of view materialization and incremental view evaluation. First, OptIQ decomposes iterative queries into invariant and variant views, and materializes the former view. Redundant computations are removed by reusing the materialized view among iterations. Second, OptIQ incrementally evaluates the variant view, so that redundant computations are removed by skipping the evaluation on converged tuples in the variant view. We verify the effectiveness of OptIQ through the queries of PageRank and k-means clustering on real datasets. The results show that OptIQ achieves high efficiency, up to five times faster than is possible without removing the redundant computations among iterations.
AB - We propose OptIQ, a query optimization approach for iterative queries in a distributed environment. OptIQ removes redundant computations among different iterations by extending the traditional techniques of view materialization and incremental view evaluation. First, OptIQ decomposes iterative queries into invariant and variant views, and materializes the former view. Redundant computations are removed by reusing the materialized view among iterations. Second, OptIQ incrementally evaluates the variant view, so that redundant computations are removed by skipping the evaluation on converged tuples in the variant view. We verify the effectiveness of OptIQ through the queries of PageRank and k-means clustering on real datasets. The results show that OptIQ achieves high efficiency, up to five times faster than is possible without removing the redundant computations among iterations.
UR - http://www.scopus.com/inward/record.url?scp=84896956234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84896956234&partnerID=8YFLogxK
U2 - 10.14778/2732240.2732243
DO - 10.14778/2732240.2732243
M3 - Conference article
AN - SCOPUS:84896956234
SN - 2150-8097
VL - 7
SP - 241
EP - 252
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 4
T2 - Proceedings of the 40th International Conference on Very Large Data Bases, VLDB 2014
Y2 - 1 September 2014 through 5 September 2014
ER -