Searching in Metric Spaces by Spatial Approximation

Gonzalo Navarro

We propose a new data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them, which satisfies the triangular inequality. The goal is, given a set of objects and a query, retrieve those objects close enough to the query. The number of distances computed to achieve this goal is the complexity measure. Our data structure, called sa-tree ("spatial approximation tree"), is based on approaching spatially the searched objects. We analyze our method and show that the number of distance evaluations to search among n objects is o(n). We show experimentally that the sa-tree is the best existing technique when the metric space is high-dimensional or the query has low selectivity. These are the most difficult cases in real applications.