The problem of efficiently querying large amounts of linked data using Map-Reduce is investigated in this paper. The proposed approach is based on the following assumptions: a) Data graphs are arbitrarily partitioned in the distributed file system in such a way that replication of data triples between the data segments is allowed. b) Data triples are replicated in such a way that answers to a special form of queries, called subject-object star queries, can be obtained from a single data segment. c) Each query posed by the user can be transformed into a set of subject-object star subqueries. We propose a one-and-a-half-phase, scalable Map-Reduce algorithm that efficiently computes the answers of the initial query by computing and appropriately combining the subquery answers. We prove that, under certain conditions, a query can be answered in a single Map-Reduce phase.
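The idea of answering star subqueries locally and then combining their answers can be illustrated with a small in-memory sketch. This is not the algorithm of the paper. It assumes that a subject-object star subquery is a set of triple patterns sharing one central node, which appears as the subject or the object of each pattern. The toy triples, the hand-written decomposition, and the function names (`eval_subquery`, `map_phase`, `reduce_phase`) are all hypothetical, and the map and reduce steps are simulated with plain Python loops rather than a real Map-Reduce framework.

```python
# Illustrative simulation only: toy data and helper names are assumptions.

def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:  # variable already bound to another value
                return None
        elif p != t:                       # constant mismatch
            return None
    return new

def eval_subquery(patterns, segment):
    """All bindings of a list of triple patterns inside one data segment."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in segment
                    if (b2 := match_pattern(pattern, t, b)) is not None]
    return bindings

def map_phase(segments, subqueries):
    """Map step: every segment answers every star subquery locally."""
    answers = {i: set() for i in range(len(subqueries))}
    for segment in segments:
        for i, subquery in enumerate(subqueries):
            for b in eval_subquery(subquery, segment):
                # A set drops duplicate answers caused by replicated triples.
                answers[i].add(frozenset(b.items()))
    return answers

def reduce_phase(answers):
    """Reduce step: join the subquery answers on their shared variables."""
    result = [{}]
    for i in sorted(answers):
        joined = []
        for r in result:
            for b in map(dict, answers[i]):
                if all(r.get(k, v) == v for k, v in b.items()):
                    joined.append({**r, **b})
        result = joined
    return result

# Two segments with replicated triples (assumption a).
segments = [
    [("alice", "knows", "bob"), ("alice", "worksAt", "acme"), ("bob", "knows", "carol")],
    [("bob", "knows", "carol"), ("carol", "livesIn", "athens"), ("alice", "knows", "bob")],
]

# Hand-written decomposition of the query
#   ?x worksAt acme . ?x knows ?y . ?y knows ?z . ?z livesIn ?c
# into two stars, centred on ?x and ?z (assumption c).
subqueries = [
    [("?x", "worksAt", "acme"), ("?x", "knows", "?y")],
    [("?y", "knows", "?z"), ("?z", "livesIn", "?c")],
]

result = reduce_phase(map_phase(segments, subqueries))
```

In this toy setup each star is answerable from a single segment (assumption b): the star around `?x` from the first segment and the star around `?z` from the second. Joining on the shared variable `?y` gives one answer, binding `?x` to `alice`, `?y` to `bob`, `?z` to `carol`, and `?c` to `athens`.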
Bibtex: Kalogeros et al. (2015)
Eleftherios Kalogeros, Manolis Gergatsoulis, and Matthew Damigos. Redundancy in linked data partitioning for efficient query evaluation. In 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud), 497–504. IEEE, 2015.