Approximate Query Processing on High Dimensionality Database Tables Using Multidimensional Cluster Sampling View


Bibliographic Details
Main Authors:	Inoue, T.; Krishna, Aneesh; Gopalan, Raj
Format:	Journal Article
Published:	2016
Online Access:	http://hdl.handle.net/20.500.11937/25472

_version_	1848751718647463936
author	Inoue, T.; Krishna, Aneesh; Gopalan, Raj
author_sort	Inoue, T.
building	Curtin Institutional Repository
collection	Online Access
description	Approximate query processing based on random sampling is one of the most useful methods for efficient computation over large quantities of data kept in databases. However, small samples obtained through random sampling might lack the data relevant to query conditions because the samples do not adequately represent the entire dataset. The Multidimensional Cluster Sampling View has been proposed to support efficient and effective approximate query processing on common database tables. This view allows random sample records to be drawn from a database in SQL efficiently and effectively. The effectiveness of approximate query processing with this view was demonstrated on a large database table with only four dimensions. This differs from the usual number of dimensions in decision support systems, which is most commonly over ten. Therefore, further examination and evaluation focusing on dimensionality, using data with ten or more dimensions, is required to demonstrate its practicality. This paper evaluates whether the number of dimensions has an impact on the accuracy of the approximation and on the performance of the Multidimensional Cluster Sampling View. The results of the evaluation show no visible effect of dimensionality.
first_indexed	2025-11-14T07:57:11Z
format	Journal Article
id	curtin-20.500.11937-25472
institution	Curtin University Malaysia
institution_category	Local University
last_indexed	2025-11-14T07:57:11Z
publishDate	2016
recordtype	eprints
repository_type	Digital Repository
spelling	curtin-20.500.11937-25472 2017-09-13T15:16:37Z 2016 Journal Article http://hdl.handle.net/20.500.11937/25472 10.17706/jsw.11.1.80-93 restricted
title	Approximate Query Processing on High Dimensionality Database Tables Using Multidimensional Cluster Sampling View
title_sort	approximate query processing on high dimensionality database tables using multidimensional cluster sampling view
url	http://hdl.handle.net/20.500.11937/25472

Similar Items