In this paper, we exploit the widely used Map-Reduce framework and propose a generic two-phase Map-Reduce algorithm for querying large amounts of linked data. The algorithm is based on the idea that the data graph can be arbitrarily partitioned into graph segments, which can be stored on different nodes of a cluster of commodity computers. To answer a user query Q, the query is likewise decomposed into a set of subqueries. In the first phase, the subqueries are applied to each graph segment in isolation, and intermediate results are computed. In the second phase, the intermediate results are combined to obtain the answers to Q. The proposed algorithm computes the answers to a given query correctly, independently of a) the data graph partitioning, b) how the graph segments are stored, c) the query decomposition, and d) the algorithm used for calculating the (partial) results.
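The two-phase idea can be illustrated with a minimal, single-process sketch. It is not the algorithm of the paper: it assumes that the data graph is a set of (subject, predicate, object) triples, that each subquery is a single triple pattern (so every match lies entirely within one segment), and that variables are strings starting with `?`. All function names (`match_pattern`, `phase_one`, `phase_two`, `answer_query`) are hypothetical and chosen only for illustration.

```python
def is_var(term):
    # Assumption: variables are written as strings starting with "?".
    return term.startswith("?")


def match_pattern(pattern, triple):
    """Return the variable bindings if `triple` matches `pattern`, else None."""
    binding = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable repeated within one pattern must bind consistently.
            if binding.get(p, t) != t:
                return None
            binding[p] = t
        elif p != t:
            return None
    return binding


def phase_one(segments, subqueries):
    """Phase 1: apply every subquery to every graph segment in isolation."""
    partial = {i: [] for i in range(len(subqueries))}
    for segment in segments:
        for i, pattern in enumerate(subqueries):
            for triple in segment:
                binding = match_pattern(pattern, triple)
                if binding is not None:
                    partial[i].append(binding)
    return partial


def compatible(left, right):
    """Two bindings are compatible if they agree on all shared variables."""
    return all(right.get(var, val) == val for var, val in left.items())


def phase_two(partial):
    """Phase 2: join the intermediate results on their shared variables."""
    answers = [{}]
    for i in sorted(partial):
        answers = [
            {**a, **b} for a in answers for b in partial[i] if compatible(a, b)
        ]
    return answers


def answer_query(segments, subqueries):
    """Run both phases and return the list of complete variable bindings."""
    return phase_two(phase_one(segments, subqueries))
```

Under these assumptions, calling `answer_query` with the same triples split into one segment or into several disjoint segments yields the same set of answers, which mirrors the independence from partitioning claimed above. In a real Map-Reduce deployment, `phase_one` would run as map tasks over the stored segments and `phase_two` as the reduce-side join.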
Bibtex: Gergatsoulis et al. (2013)
Manolis Gergatsoulis, Christos Nomikos, Eleftherios Kalogeros, and Matthew Damigos. An algorithm for querying linked data using map-reduce. In Data Management in Cloud, Grid and P2P Systems, pages 51–62. Springer, 2013.