DAHC-tree: An Effective Index for Approximate Search in High-Dimensional Metric Spaces

Authors

  • Jurandy Almeida University of Campinas
  • Eduardo Valle University of Campinas
  • Ricardo da S. Torres University of Campinas
  • Neucimar J. Leite University of Campinas

Keywords:

clustering methods, database indexing, metric access methods, metric spaces, similarity search

Abstract

Similarity search in high-dimensional metric spaces is a key operation in many applications, such as multimedia databases, image retrieval, object recognition, and others. The high dimensionality of the data requires special index structures to facilitate the search. A problem regarding the creation of suitable index structures for high-dimensional data is the relationship between the geometry of the data and the organization of an index structure. In this paper, we study the performance of a new index structure, called Divisive-Agglomerative Hierarchical Clustering tree (DAHC-tree), which reduces the effects imposed by the above liability. DAHC-tree is constructed by dividing and grouping the data set into compact clusters. We perform a rigorous experimental design and analyze the trade-offs involved in building such an index structure. Additionally, we present extensive experiments comparing our method against state-of-the-art of exact and approximate solutions. The conducted analysis and the reported comparative test results demonstrate that our technique significantly improves the performance of similarity queries.

Downloads

Download data is not yet available.

Downloads

Published

2010-09-09

Issue

Section

Regular Articles