Real time data loading and OLAP queries: Living together in next generation BI environments

Authors

  • Diego Pereira UNIRIO
  • Leonardo Guerreiro Azevedo UNIRIO
  • Asterio Tanaka UNIRIO
  • Fernanda Baião UNIRIO

Keywords:

Real Time Data Warehouse, Business Intelligence 2.0, Database Distribution, Database Fragmentation

Abstract

Real time ETL (Extraction, Transformation and Loading) of enterprise data is one of the foremost features of next generation Business Intelligence (BI 2.0). This article proposes a Data Warehouse (DW) architecture that loads operational data in real time with lower processing time than current approaches. Distributed processing techniques, such as data fragmentation on top of a shared-nothing architecture, are used to create fragments that are specialized in the most current data and optimized for real time insertions. With this approach, the DW is updated near-line from operational data sources, so DW queries are executed over real time data or data very close to it. Moreover, real time loading does not impact query response time. In addition, we extended the Star Schema Benchmark to address loading operational data in real time. The extended benchmark was used to validate our approach and to demonstrate its efficiency compared with other approaches in the literature. The experiments were performed in the CG-OLAP research project environment.
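
The fragmentation idea in the abstract can be illustrated with a small in-memory sketch. It is not the authors' implementation: the class names (Fragment, FragmentedFactTable), the migrate step, and the sample columns (region, revenue) are all assumptions made for illustration. The sketch only shows the general pattern of routing new rows to a small fragment holding the most current data while queries merge partial aggregates computed independently on each fragment.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """One horizontal fragment of a fact table (hypothetical; one per node)."""
    rows: list = field(default_factory=list)

    def partial_sum(self, group_key, measure):
        # Each fragment aggregates only its own rows, as a node would
        # in a shared-nothing setting.
        totals = defaultdict(float)
        for row in self.rows:
            totals[row[group_key]] += row[measure]
        return totals


class FragmentedFactTable:
    """Illustrative two-fragment layout: historical data plus current data."""

    def __init__(self):
        self.historical = Fragment()  # large fragment, mostly read
        self.realtime = Fragment()    # small fragment, receives all inserts

    def insert(self, row):
        # Real time loads touch only the small fragment.
        self.realtime.rows.append(row)

    def migrate(self):
        # Assumed batch step: move current rows into the historical fragment.
        self.historical.rows.extend(self.realtime.rows)
        self.realtime.rows.clear()

    def sum_by(self, group_key, measure):
        # Merge the partial aggregates returned by every fragment.
        merged = defaultdict(float)
        for fragment in (self.historical, self.realtime):
            for key, value in fragment.partial_sum(group_key, measure).items():
                merged[key] += value
        return dict(merged)


table = FragmentedFactTable()
table.insert({"region": "ASIA", "revenue": 100.0})
table.migrate()  # the first row now lives in the historical fragment
table.insert({"region": "ASIA", "revenue": 50.0})
table.insert({"region": "EUROPE", "revenue": 20.0})
print(table.sum_by("region", "revenue"))
```

In this toy layout a query sees freshly inserted rows immediately, because the real time fragment takes part in every aggregation, while inserts never modify the larger historical fragment.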

Published

2012-09-20