A Fast Heuristic for Approximate String Matching

Ricardo Baeza-Yates and Gonzalo Navarro

We study a fast algorithm for on-line approximate string matching. It is based on a non-deterministic finite automaton, which is simulated using bit-parallelism. If the automaton does not fit in a computer word, we partition the problem into subproblems. We show experimentally that this algorithm is the fastest for typical text search. We also show which algorithms are the best in other cases, and derive the fastest known heuristic for on-line approximate string matching, when the pattern is not very large. The focus of this work is mainly practical.