Average Complexity of Exact and Approximate Multiple String Matching

Gonzalo Navarro and Kimmo Fredriksson

We show that the average number of characters examined to search for r random patterns of length m in a text of length n over a uniformly distributed alphabet of size s cannot be less than Omega(n*log_s(rm)/m). When we permit up to k insertions, deletions, and/or substitutions of characters in the occurrences of the patterns, the lower bound becomes Omega(n(k+log_s(rm))/m). This generalizes previous single-pattern lower bounds of Yao (for exact matching) and of Chang and Marr (for approximate matching), and proves the optimality of several existing multipattern search algorithms.