Picking adequate samples for approximate decision support queries using inverse SRSWOR
A simple random sample of records from a large data warehouse may not contain a sufficient number of records that satisfy highly selective queries. Efficient sampling schemes for such queries involve using innovative techniques that can access records that are relevant to specific queries. In drawing...
| Main Authors: | Rudra, Amit; Gopalan, Raj; Achuthan, Narasimaha |
|---|---|
| Other Authors: | Ford Lumban Gaol |
| Format: | Conference Paper |
| Published: | IJISCA 2012 |
| Subjects: | sampling; data warehousing; approximate query processing |
| Online Access: | http://hdl.handle.net/20.500.11937/16253 |
| _version_ | 1848749122961539072 |
|---|---|
| author | Rudra, Amit; Gopalan, Raj; Achuthan, Narasimaha |
| author2 | Ford Lumban Gaol |
| author_facet | Ford Lumban Gaol; Rudra, Amit; Gopalan, Raj; Achuthan, Narasimaha |
| author_sort | Rudra, Amit |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | A simple random sample of records from a large data warehouse may not contain a sufficient number of records that satisfy highly selective queries. Efficient sampling schemes for such queries involve using innovative techniques that can access records that are relevant to specific queries. In drawing the sample, it is advantageous to know what would be an adequate sample size for a given query. This paper proposes methods for picking adequate samples that ensure approximate query results with a desired level of accuracy. A special index based on a structure known as the k-MDI Tree is used to draw samples. An unbiased estimator named inverse simple random sampling without replacement is adapted to estimate adequate sample sizes for queries. The methods are evaluated experimentally on a large real-life data set. The evaluation results show that adequate sample sizes can be determined, with errors in the outputs of most queries within the acceptable limit of 5%. |
| first_indexed | 2025-11-14T07:15:55Z |
| format | Conference Paper |
| id | curtin-20.500.11937-16253 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T07:15:55Z |
| publishDate | 2012 |
| publisher | IJISCA |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-162532017-01-30T11:54:48Z Picking adequate samples for approximate decision support queries using inverse SRSWOR Rudra, Amit Gopalan, Raj Achuthan, Narasimaha Ford Lumban Gaol sampling data warehousing approximate query processing A simple random sample of records from a large data warehouse may not contain sufficient number of records that satisfy highly selective queries. Efficient sampling schemes for such queries involve using innovative techniques that can access records that are relevant to specific queries. In drawing the sample, it is advantageous to know what would be an adequate sample size for a given query. This paper proposes methods for picking adequate samples that ensure approximate query results with a desired level of accuracy. A special index based on a structure known as the k-MDI Tree is used to draw samples. An unbiased estimator named inverse simple random sampling without replacement is adapted to estimate adequate sample sizes for queries. The methods are evaluated experimentally on a large real life data set. The results of evaluation show that adequate sample sizes can be determined with errors in outputs of most queries within the acceptable limit of 5%. 2012 Conference Paper http://hdl.handle.net/20.500.11937/16253 IJISCA restricted |
| title | Picking adequate samples for approximate decision support queries using inverse SRSWOR |
| topic | sampling; data warehousing; approximate query processing |
| url | http://hdl.handle.net/20.500.11937/16253 |
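
The description above mentions inverse simple random sampling without replacement (inverse SRSWOR) as the basis for choosing a sample size. As a rough illustration only, the sketch below shows the textbook form of that idea: keep drawing records without replacement until `k` of them satisfy the query, then estimate the matching fraction as `(k - 1) / (n - 1)`, where `n` is the number of draws. This is not the paper's adapted procedure and does not involve the k-MDI Tree. The function name `inverse_srswor`, the predicate, and the synthetic population are all assumptions made for the example.

```python
import random


def inverse_srswor(population, predicate, k, rng=None):
    """Draw records without replacement until k of them satisfy `predicate`.

    Returns (n, estimate), where n is the number of draws made and
    estimate = (k - 1) / (n - 1) is the standard unbiased estimate of
    the fraction of matching records under inverse sampling.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if k < 2:
        raise ValueError("k must be at least 2 for the (k - 1)/(n - 1) estimator")
    rng = rng or random.Random()
    # Visiting records in a uniformly random order is equivalent to
    # simple random sampling without replacement.
    order = list(range(len(population)))
    rng.shuffle(order)
    hits = 0
    for n, idx in enumerate(order, start=1):
        if predicate(population[idx]):
            hits += 1
            if hits == k:
                return n, (k - 1) / (n - 1)
    raise ValueError("population holds fewer than k matching records")


# Synthetic stand-in for a warehouse table: 2% of the records match.
records = list(range(100_000))


def is_match(value):
    return value % 50 == 0


draws, fraction = inverse_srswor(records, is_match, k=200, rng=random.Random(7))
print(f"draws needed: {draws}")
print(f"estimated matching records: {fraction * len(records):.0f}")
```

The number of draws adapts to the query's selectivity: a highly selective query needs more draws to reach `k` matches, which is why the stopping point can serve as a guide to an adequate sample size.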