Fast Regular Expression Search

Gonzalo Navarro and Mathieu Raffinot

We present a new algorithm to search regular expressions, which is able to skip text characters. The idea is to determine the minimum length l of a string matching the regular expression, manipulate the original automaton so that it recognizes all the reverse prefixes of length up to l of the strings accepted, and use it to skip text characters as done for exact string matching in previous work. As we show experimentally, the resulting algorithm is fast, the fastest one in many cases of interest.