Suffix Arrays in Parallel

Mauricio Marín and Gonzalo Navarro

Suffix arrays are powerful data structures for text indexing. Unlike the popular inverted files, suffix arrays do not depend on the existence of words in the text and are able of solving much more complex search problems. This makes them appealing in many scenarios, such as dealing with oriental languages and genomic and protein databases.

In this paper we present parallel algorithmis devised to increase throughput of suffix arrays on a multiple-query setting. Design and cost evaluatio is effected on top of the bulk-synchronous model of parallel computing and thereby they are independent of programming details and architecture of the parallel machine. Experimental results show that efficient performance is indeed feasible in this strongly sequential and very poor locality data structure.