The HeightBL Algorithm for Bulk-loading F-Onion-trees

  • Arthur Emanuel de Oliveira Carosia University of São Paulo
  • Ricardo Rodrigues Ciferri UFSCAR
  • Cristina Dutra de Aguiar Ciferri University of São Paulo
Keywords: metric access method, similarity search, bulk-loading, Onion-tree, F-Onion-tree

Abstract

The F-Onion-tree is a robust access method that slices the metric space into disjoint subspaces to provide quick indexing of complex data in the main memory. However, the F-Onion-tree only performs element-by-element insertions into its structure, i.e. it does not introduce a technique to build the index considering all elements of the dataset at once. In this article, we fill this gap. We propose the HeightBL algorithm for bulk-loading F-Onion-trees. Performance tests with real-world data with different volumes and dimensionalities showed that the index produced by the HeightBL algorithm is very compact. Compared with the element-by-element insertion, the size of the index reduced from 53.42% to 71.25%. The experiments also showed that the HeightBL algorithm significantly improved range and k-NN query processing performance. It required from 13.38% up to 99.94% less distance calculations and was from 8.57% up to 99.04% faster than the element-by-element insertion.

Published
2014-09-28
Section
SBBD Articles