Ascertaining data mining rules using statistical approaches

Knowledge acquisition techniques have been well researched in the data mining community. Such techniques, especially when used for unsupervised learning, often generate a large quantity of rules and patterns. While many rules generated are useful and interesting, some information is not captured by...


Bibliographic Details
Main Authors: Mohd Shaharanee, I., Dillon, Tharam S, Hadzic, Fedja
Other Authors: Parvinder S. Sandhu
Format: Conference Paper
Published: International Association of Computer Science and Information Technology (IACSIT) 2009
Subjects: Data mining; significant rules; statistical analysis
Online Access: http://hdl.handle.net/20.500.11937/43700
author Mohd Shaharanee, I.
Dillon, Tharam S
Hadzic, Fedja
author2 Parvinder S. Sandhu
building Curtin Institutional Repository
collection Online Access
description Knowledge acquisition techniques have been well researched in the data mining community. Such techniques, especially when used for unsupervised learning, often generate a large quantity of rules and patterns. While many of the generated rules are useful and interesting, others are not, such as already known patterns, coincidental patterns, and patterns with no significant value for real-world applications. Sustaining the interestingness of rules generated by data mining algorithms is an active and important area of data mining research. Different measures have been proposed and well examined for discovering interestingness in rules. These measures often reflect interestingness only with respect to the database being observed, so the rules satisfy the constraints with respect to the sample data only, not with respect to the whole data distribution. Therefore, one can still question the usefulness of the rules and patterns in practical problems. Because data mining techniques are inherently data driven, it would be beneficial to confirm the generated hypotheses with a statistical methodology. In our research, we investigate how to combine data mining and statistical measurement techniques to arrive at a more reliable and interesting set of rules. Such a combination is essential for coping with data overload in practical problems. A real-world data set is used to explore the ways in which one can measure and verify the usefulness of rules from data mining techniques using statistical analysis.
first_indexed 2025-11-14T09:17:37Z
format Conference Paper
id curtin-20.500.11937-43700
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T09:17:37Z
publishDate 2009
publisher International Association of Computer Science and Information Technology (IACSIT)
recordtype eprints
repository_type Digital Repository
title Ascertaining data mining rules using statistical approaches
topic Data mining
significant rules
statistical analysis
url http://hdl.handle.net/20.500.11937/43700