Nearest Neighbor Queries with Counting Aggregate-based Conditions

Authors

  • Daniel S Kaster UEL
  • Willian D Oliveira ICMC/USP
  • Renato Bueno UFSCAR
  • Agma J M Traina ICMC/USP
  • Caetano Traina Jr. ICMC/USP

Keywords:

Metric Access Methods, Nearest Neighbor Search, Similarity Queries

Abstract

The amount of complex data, such as images, videos and time series, has been increasing in the past few years in a very fast pace. The main property employed to query such kind of data based on its content is the similarity between elements. One of the most common similarity queries is the k-Nearest Neighbor query (k-NNq), which returns the k elements most similar to a given reference element. Although this kind of query is useful for many applications, it does not consider additional conditions that may modify the basic nearest neighbor search, which would allow retrieving more relevant information to the user. Complex data are usually associated with other information and usually have metadata describing the data content and/or context. Such external (non-content-based) information can be employed to define the conditions modifying the k-NN search to answer advanced queries over complex data. This paper presents a new variation to the k-NNq, which includes counting aggregate-based conditions.
This extension allows answering queries that are not easily definable using only external conditions and regular k-NN operation. Our work defines the counting aggregate-based conditions and presents algorithms to execute k-NN queries with these conditions, using either sequential scan or a metric access method.
Experiments over real datasets are also presented, comparing the developed algorithms and showing that search performance can be largely improved using an appropriate access method.

Downloads

Download data is not yet available.

Author Biography

Daniel S Kaster, UEL

Daniel S. Kaster received the B.Sc. degree in Computer Science from the University of Londrina, Brazil, in 1998 and the M.Sc. degree in Computer Science from the University of Campinas, Brazil, in 2001, under supervision of Prof. Claudia Bauzer Medeiros. He is currently a Lecturer with the Computer Science Department of the University of Londrina, Brazil, and a Ph.D. candidate in Computer Science from the University of São Paulo at São Carlos, Brazil, under supervision of Prof. Caetano Traina Jr. His research interests include searching complex data and multimedia and geographic databases.

Downloads

Published

2011-09-13

Issue

Section

SBBD Articles