Towards Recommendations for Horizontal XML Fragmentation
The large amount of XML data available on the web and inside organizations makes query processing performance a major concern. Several techniques can improve this performance, including indexing and data distribution. The increasing popularity of clouds, clusters and grids makes data distribution a feasible alternative. In these environments, data is fragmented and distributed across several nodes, and user queries are processed in parallel, thus improving performance. However, the problem of how to fragment an XML database has not been adequately addressed. The literature offers many definitions of XML fragments, but few proposals focus on how to use those definitions to actually fragment a database, a process called fragmentation design. Inspired by the relational and object-oriented models, both of which have solid methodologies for fragmentation design, this article studies and proposes guidelines that could be used in a fragmentation design algorithm for XML databases, with the aim of improving query processing performance.