A hybrid model for discovering significant patterns in data mining

Significant pattern mining is one of the most important research areas and a major concern in data mining. Significant patterns are very useful because they can reveal a new dimension of knowledge in certain domain applications. There are three categories of significant patterns named frequent p...


Bibliographic Details
Main Author: Abdullah, Zailani
Format: Thesis
Language: English
Published: 2012
Subjects: QA Mathematics; QA76 Computer software
Online Access: http://eprints.uthm.edu.my/2207/
http://eprints.uthm.edu.my/2207/1/24p%20ZAILANI%20ABDULLAH.pdf
http://eprints.uthm.edu.my/2207/2/ZAILANI%20ABDULLAH%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/2207/3/ZAILANI%20ABDULLAH%20WATERMARK.pdf
building UTHM Institutional Repository
collection Online Access
description Significant pattern mining is one of the most important research areas and a major concern in data mining. Significant patterns are very useful because they can reveal a new dimension of knowledge in certain domain applications. There are three categories of significant patterns: frequent patterns, least patterns and significant least patterns. Typically, these patterns are derived from the absolute frequent patterns or are mixed with the least patterns. In market-basket analysis, frequent patterns are considered significant patterns and have already made many contributions. The Frequent Pattern Tree (FP-Tree) is one of the best-known data structures for handling batched frequent patterns, but it must rely on the original database. For detecting exceptional occurrences or events with high implications, such as unanticipated substances that cause air pollution, unexpected degree programs selected by students, or unpredictable motorcycle models preferred by customers, the least patterns are more meaningful than the frequent ones. However, for this category of patterns, generating a standard tree data structure may trigger memory overflow because the minimum support threshold must be lowered. Furthermore, the classical support-confidence measure has many limitations: choosing the right support-confidence values is tricky, interpretations based on the support-confidence combination can be misleading, and the measure is not scalable enough to deal with significant least patterns. Therefore, to overcome these drawbacks, this thesis proposes a Hybrid Model for Discovering Significant Patterns (Hy-DSP), which combines the Efficient Frequent Pattern Mining Model (EFP-M2), the Efficient Least Pattern Mining Model (ELP-M2) and the Significant Least Pattern Mining Model (SLP-M2). The proposed model is developed using the latest .NET framework with C# as the programming language. Experiments with the UCI datasets showed that Hy-DSP, which consists of DOSTrieIT and LP-Growth*, outperformed the benchmarked CanTree and FP-Growth by up to 4.13 times (75.78%) and 10.37 times (90.31%), respectively, thus verifying its efficiency. In addition, the number of patterns produced by the models is also lower than that produced by the standard measures.
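The classical support-confidence measure and the minimum support threshold mentioned in the description can be illustrated with a minimal Python sketch. It is not the thesis's Hy-DSP implementation (which the record says was written in C#) and does not use FP-Tree, DOSTrieIT or LP-Growth*. The toy transactions, the function names and the brute-force enumeration are all assumptions made for illustration, and treating "below the threshold but present at least once" as a stand-in for least patterns is likewise an assumption, since the record does not give the thesis's formal definition.

```python
from itertools import combinations

# Toy market-basket transactions (hypothetical data, not from the thesis).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "caviar"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): supp(A and C) / supp(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def split_by_min_support(transactions, min_support, max_size=2):
    """Brute-force enumeration of itemsets up to `max_size` items.

    Returns (frequent, below): itemsets meeting the threshold, and itemsets
    that occur at least once but fall below it. The second group is only an
    assumed stand-in for "least patterns".
    """
    items = sorted(set().union(*transactions))
    frequent, below = {}, {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s == 0:
                continue  # never occurs, so it is not a pattern at all
            target = frequent if s >= min_support else below
            target[candidate] = s
    return frequent, below

frequent, below = split_by_min_support(transactions, min_support=0.4)
print("frequent:", frequent)
print("below threshold:", below)
print("conf(bread -> milk):", confidence({"bread"}, {"milk"}, transactions))
```

With a threshold of 0.4, the rare item `caviar` and every itemset containing it fall below the cut-off. Lowering the threshold to capture such itemsets admits many more candidates, which is the memory pressure the description attributes to standard tree structures.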
first_indexed 2025-11-15T19:58:08Z
format Thesis
id uthm-2207
institution Universiti Tun Hussein Onn Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T19:58:08Z
publishDate 2012
recordtype eprints
repository_type Digital Repository
spelling uthm-2207 2021-10-31T04:06:28Z 2012-07 Thesis NonPeerReviewed Abdullah, Zailani (2012) A hybrid model for discovering significant patterns in data mining. Masters thesis, Universiti Tun Hussein Onn Malaysia.
title A hybrid model for discovering significant patterns in data mining
topic QA Mathematics
QA76 Computer software