Optimizing RPQs over a Compact Graph Representation

Diego Arroyuelo, Adrián Gómez-Brandón, Aidan Hogan, Gonzalo Navarro, and Javiel Rojas-Ledesma

We propose techniques to evaluate regular path queries (RPQs) over labeled graphs (e.g., RDF). We apply a bit-parallel simulation of a Glushkov automaton representing the query over a ring: a compact wavelet-tree-based index of the graph. To the best of our knowledge, our approach is the first to evaluate RPQs over a compact representation of such graphs, where we show the key advantages of using Glushkov automata in this setting. Our scheme obtains optimal time, in terms of alternation complexity, for traversing the product graph. We further introduce various optimizations, such as the ability to process several automaton states and graph nodes/labels simultaneously, and to estimate relevant selectivities. Experiments show that our approach uses 3-5x less space, and is over 5x faster, on average, than the next best state-of-the-art system for evaluating RPQs.