Charla: Space-Efficient Data Structures: Smaller and Faster

Francisco Claude, -Faust, University of Waterloo, PhD Candidate
25 Abril, 2012 - 12:30
Auditorio DCC, 3er piso

*Space efficient data structures* provide fast operations in little space on structured data, such as bitmaps, trees, texts in natural language, structured texts (e.g., XML, RDF), binary relations (e.g., inverted indexes) and graphs (e.g., social graph of Facebook). One of the most successful space efficient data structure is the *Wavelet Tree*: it is used for indexing set of points, for compressing and indexing strings, and for compressing and indexing binary relations.

We propose to combine and extend those results to *grammar-compressed texts*, such as to produce an index proportional to the size of the grammar. For example, the Lempel-Ziv compression algorithm generates a grammar, and it is used in industrial tools such as compress. We show that our index supports random access and pattern matching in the text.

This work is done in collaboration with Jérémy Barbay, Susana Ladra, J. Ian Munro, Gonzalo Navarro, Patrick Nicholson, and Diego Seco.