qCube: Efficient integration of range query operators over a high dimension data cube
Keywords: Data Cube, High Dimension, Inquire Query, OLAP, Range Query.
Abstract: Many decision support tasks involve range query operators such as Similar, Not Equal, Between, Greater or Less than and Some. Traditional cube approaches use only the Equal operator in their summarized queries. Recent cube approaches implement range query operators, but they suffer from the dimensionality problem, in which a linear increase in the number of dimensions causes exponential growth in storage space and runtime. Frag-Cubing and its bitmap-index extension are the most promising sequential solutions for high dimension data cubes, but they implement only the Equal and Sub-cube query operators. In this paper, we present a new sequential range cube approach for high dimension data, named Range Query Cube, or simply qCube. qCube implements the Equal, Not Equal, Distinct, Sub-cube, Greater or Less than, Some, Between, Similar and Top-k Similar query operators over a high dimension data cube. Comparative tests of qCube and Frag-Cubing use relations with 20, 30 or 60 dimensions, 5k distinct values in each dimension and 10 million tuples. In general, qCube behaves similarly to Frag-Cubing, but it answers point and inquire queries faster. Frag-Cubing could not answer inquire queries with more than two Sub-cube operators on a relation with 30 dimensions, 5k cardinality and 10M tuples, whereas qCube efficiently answered inquire queries on the same relation using six Sub-cube or Distinct operators. Complex queries with 30 operators, combining point, range and inquire operators, generally took qCube less than 10 seconds to answer. A massive qCube with 60 dimensions, a cardinality of 5k in each dimension and 100M tuples answered queries with five range operators, ten point operators and one inquire operator in less than 2 minutes.
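To make the distinction between point and range operators concrete, the following is a minimal sketch, not the qCube implementation. It assumes a per-dimension inverted index that maps each attribute value to the set of tuple IDs containing it, and it answers a query by intersecting the tuple-ID sets of its predicates. The function names (`build_index`, `equal`, `not_equal`, `between`, `count`) and the operator semantics shown are illustrative assumptions.

```python
from collections import defaultdict


def build_index(rows):
    """Map each dimension to {value: set of tuple IDs holding that value}."""
    index = defaultdict(lambda: defaultdict(set))
    for tid, row in enumerate(rows):
        for dim, value in enumerate(row):
            index[dim][value].add(tid)
    return index


def equal(index, dim, value):
    """Point operator: tuple IDs whose attribute equals `value`."""
    return set(index[dim].get(value, ()))


def not_equal(index, dim, value):
    """Range operator (assumed semantics): every tuple ID except the matches."""
    tids = set()
    for other, ids in index[dim].items():
        if other != value:
            tids |= ids
    return tids


def between(index, dim, low, high):
    """Range operator (assumed semantics): low <= attribute <= high."""
    tids = set()
    for value, ids in index[dim].items():
        if low <= value <= high:
            tids |= ids
    return tids


def count(tid_sets):
    """Aggregate (COUNT) over the intersection of all predicate results."""
    return len(set.intersection(*tid_sets))


# Toy relation with three dimensions (hypothetical data).
rows = [
    (1, 10, 5),
    (1, 20, 7),
    (2, 20, 5),
    (2, 30, 9),
    (1, 30, 5),
]
index = build_index(rows)

# dim0 = 1 AND dim1 BETWEEN 15 AND 30
result = count([equal(index, 0, 1), between(index, 1, 15, 30)])
print(result)  # 2 (tuple IDs 1 and 4)
```

In this sketch a range operator scans every distinct value of one dimension, so its cost grows with the cardinality of that dimension rather than with the number of dimensions.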