IMFD Talk: "Indexing Genomic Databases".


Speaker: 
Travis Gagie (Universidad Diego Portales)
Date: 
March 15, 2019 - 12:00
Room: 
Sala Philippe Flajolet (3rd floor, West Building - Beauchef 851)
Organizer: 
Instituto Milenio Fundamentos de los Datos

 

Abstract:
Since the first human genomes were sequenced and assembled de novo nearly twenty years ago, hundreds of thousands of others have been sequenced and assembled by variation calling. That much data is a valuable resource but also a problem for algorithms and data structures designed to handle megabytes or gigabytes but not terabytes or petabytes. In this talk we take variation calling itself as an example and consider why it should become easier as genomic databases grow, why it has not so far, and why it will soon. Specifically, we describe a new and scalable version of the FM-index data structure underlying modern DNA aligners.

 

Bio: Travis Gagie is currently an associate professor at the Universidad Diego Portales and a researcher at the Chilean Center for Biotechnology and Bioengineering, specializing in compressed data structures for bioinformatics. He received a BSc in Cognitive Science from Queen's University, an MSc in Computer Science from the University of Toronto and a Dr. rer. nat. in Bioinformatics from Bielefeld University, Germany. Between his master's and doctorate he studied for a year at the National Research Center in Pisa and worked for two years at the University of Eastern Piedmont. After graduating, he worked as a post-doctoral researcher at the University of Chile, Aalto University and the University of Helsinki. He has published over a hundred conference and journal papers, served on the committees of about two dozen conferences and workshops, and recently co-chaired the 2018 International Symposium on String Processing and Information Retrieval (SPIRE).