Space & Time Efficient Leapfrog Triejoin

Diego Arroyuelo, Daniela Campos, Adrián Gómez-Brandón, Gonzalo Navarro, Carlos Rojas, and Domagoj Vrgoc

Leapfrog Triejoin (LTJ) is arguably the most practical and popular worst-case-optimal (wco) algorithm for solving basic graph patterns in graph databases. Its main drawback is that it needs the database triples (subject, predicate, object) represented as paths in a trie, for each of the six orders of subject, predicate, and object. The resulting blowup in space makes most systems disregard LTJ or implement it only partially, and their corresponding algorithms be non-wco. In this paper we show that, by using compact data structures, it is possible to build an index that at the same matches the query time performance of the fastest classic wco index, and uses half the space of non-wco indices (which are much slower). Concretely, we make use of compact tree representations to store functional tries using one bit per trie edge, instead of one pointer. The resulting structure, called compactLTJ, uses 25% of the space of classic wco implementations and 45%-65% of classic non-wco systems. At solving queries, it is on par with the fastest classic wco system, and two orders of magnitude faster than non-wco systems. We further incorporate improved query resolution strategies into compactLTJ, which makes it considerably faster than any other alternative to display the first query results.