Charla: “Compact and Succinct Data Structures -- From Theory to Practice”


Compartir
Charlista: 
Simon Gog, Karlsruhe Institute of Technology
Fecha: 
7 Octubre, 2015 - 16:00
Sala: 
Auditorio DCC N°315, Piso 3

 

Abstract:

For decades index structures where build on top of the data to enable users to efficiently carry out queries. For instance suffix trees or arrays were built on top of a text to answer pattern matching queries in a time complexity which is independent from the text length.

Unfortunately, these traditional pointer-based index structures often take significantly more space than the original data and therefore can not be used in scenarios where the data itself is not much smaller than the available main memory. In the last 25 years researches invented space-efficient counterparts for many index structures which use not much more space than the original data and have the same query complexity in theory. In this talk we will review popular examples of compact and succinct structures -- ranging from bit vectors over wavelet trees to compressed suffix trees -- and show how they can be easily used in applications by employing the Succinct Data Structure Library (SDSL). We will further show how to use the library's facilities to analyse, measure, and monitor time and space requirements of structures.   Finally, we will learn how more complex  structures can be composed and integrated in the existing framework.

SDSL:https://github.com/simongog/sdsl-lite

 

Bio

Simon Gog obtained his PhD from Ulm University where he was working on  space-efficient index data structures with applications in Bioinformatics. The Succinct Data Structure Library project was started in Ulm and continued at the University of Melbourne, where he was working as a PostDoc on compressed external index structures.   Currently, Simon is working at the Karlsruhe Institute of Technology on compact index structures for applications in the area of Information Retrieval.