A Stitch in Time Saves Nine -- SPARQL Querying of Property Graphs using Gremlin Traversals

Tracking #: 1821-3034

This paper is currently under review
Authors: 
Harsh Vrajesh Thakkar
Dharmen Punjani
Yashwant Keswani
Jens Lehmann
Soeren Auer

Responsible editor: 
Claudia d'Amato

Submission type: 
Full Paper
Abstract: 
Knowledge graphs have become popular in recent years and frequently rely on the Resource Description Framework (RDF) or Property Graphs (PG) as underlying data models. However, the query languages for these two data models -- SPARQL for RDF and Gremlin for property graph traversal -- lack interoperability. We present Gremlinator, a novel SPARQL-to-Gremlin translator. Gremlinator translates SPARQL queries into Gremlin traversals for executing graph pattern matching queries over graph databases. This allows users to access and query a wide variety of Graph Data Management Systems (DMSs) using the W3C-standardized SPARQL query language, avoiding the learning curve of a new graph query language. Gremlin is a system-agnostic traversal language covering both OLTP graph databases and OLAP graph processors, which makes it a desirable choice for supporting interoperability with respect to querying Graph DMSs. We present a comprehensive empirical evaluation of Gremlinator and demonstrate its validity and applicability by executing SPARQL queries on top of the leading graph stores Neo4J, Sparksee, and Apache TinkerGraph, comparing their performance with that of the RDF stores Virtuoso, 4Store, and JenaTDB. Our evaluation demonstrates the substantial performance gain obtained by the Gremlin counterparts of the SPARQL queries, especially for star-shaped and complex queries.
Tags: 
Under Review