Online Approximate String Matching (1996)

This is the code corresponding to the paper "Improving an Algorithm for Approximate Pattern Matching", by R. Baeza-Yates and G. Navarro.

It is a gzipped tar with the necessary explanations in a README file inside. It is compiled code for SunOS, no source code.

This code can be freely used and distributed for teaching and research purposes, provided it is not modified and this notice is kept attached. No part of this code can be used for direct or indirect commercial advantage without written permission of the authors.

Here is the code.

NR-grep: Fast and Flexible Text Searching (2000)

This is the code corresponding to the paper "NR-grep: A Fast and Flexible Pattern Matching Tool", by G. Navarro.

Get the paper that explains all the aspects of the software and has appeared in Software Practice & Experience.

Get the code (version 1.1.2), which is a gzipped tar with the necessary explanations in a README file.
It is source code distributed under a GNU License. The code has been tested on Solaris and Linux. No guarantees, though.

Want older versions? nrgrep version 1.0 nrgrep version 1.1. nrgrep version 1.1.1.

JTV: Java Turing Visual (2002)

This is a graphical interface to simulate Turing Machines, using the abbreviated notation of Lewis and Papadimitriou. It is the undergraduate thesis of Marco Mora, a student of mine. It can work on English too.

This is the official JTV page. Specific questions and comments to Marco Mora.

IXPN: Indexed XPath via Proximal Nodes (2002)

This is a demo of an index based on the Proximal Nodes model, which solves XPath queries. It is the undergraduate thesis of Manuel Ortega, a student of mine. The demo uses the Shakespeare collection, which is widely available.

This is the official IXPN page. Specific questions and comments to Manuel Ortega.

LZ-index: A Text Index based on the Ziv-Lempel Trie (2003)

This is the code corresponding to the paper "Indexing Text with the Ziv Lempel Trie" and "Implementing the LZ-index: Theory and Practice", by G. Navarro.

Get either the paper already published in JDA or the full technical report, which explains all the aspects of the implementation and parts thereof have been submitted to ACM Journal of Experimental Algorithmics.

Get version 1.1 of the LZ-index, which is the current one. It differs from the original in a new rank implementation obtained with Rodrigo González, Szymon Grabowski and Veli Mäkinen, which made the index noticeably faster.

You can also get version 1.0, the original code tested in the JDA paper.

In the paper submitted to ACM JEA, a new version of the index is proposed that takes 80% of the space of the original index and is up to 3 times slower for short queries but almost similar on longer queries. Get the code of that version. It uses the improved rank function of version 1.1.

All codes come as gzipped tars with the necessary explanations in a README file and the copyright in a COPYRIGHT file.
They are source code that have been tested on Solaris and Linux.

The LZindex is compared against my particular implementation of the FMindex in the paper. Get that implementation if you want to play with it. It is a gzipped tar with the necessary explanations in a README file and the copyright in a COPYRIGHT file. It contains source code that has been tested on Solaris and Linux.
NOTE: This implementation follows the idea of Ferragina and Manzini's paper, however, it has been optimized to use much more space, so I could make a fair comparison against the LZindex, which is larger. You might instead be interested in their implementation.

Finally, you can get Sadakane's implementation of his Compressed Suffix Array I compared to in the paper.

LZgrep: A Direct Compressed Text Search Tool (2003 and 2005)

This is the code corresponding to the paper "LZgrep: A Boyer-Moore String Matching Tool for Ziv-Lempel Compressed Text", by G. Navarro and J. Tarhio. It contains improvements by Carlos Avendano Pérez.

Get the paper that explains all the aspects of the software and appeared in Software Practice & Experience.

Get the code (version 0.99), which is a gzipped tar with the copyright notice in a COPYRIGHT file. Build it with "make" and run "lzgrep --help" to start.
It is source code that has been tested on Solaris and Linux. Please see copyright details and (absence of) warranties in the COPYRIGHT file.
Note: option D-WM not yet implemented (works using D-SBOM). All the rest should work.

Pizza&Chili Site: Compressed Full-Text Indexing Corpus and Testbed (2005)

The Pizza&Chili Site contains free compressed full-text index implementations and testing test collections. It is a joint effort with Paolo Ferragina, University of Pisa.

Get the technical report where the current compressed indexes are surveyed, some of which are implemented in the site.