An improved self organizing map using jaccard new measure for textual bugs data clustering

In software projects, a data repository holds the bug reports. These bugs must be analyzed carefully to resolve the underlying problems. Handling them manually is extremely time-consuming and can delay the resolution of important bugs. To overc...

Bibliographic Details
Main Author: Ahmed, Attika
Format: Thesis
Language: English
Published: 2018
Subjects: QA Mathematics; QA1-43 General
Online Access: http://eprints.uthm.edu.my/7554/
http://eprints.uthm.edu.my/7554/1/24p%20ATTIKA%20AHMED.pdf
http://eprints.uthm.edu.my/7554/2/ATTIKA%20AHMED%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/7554/3/ATTIKA%20AHMED%20WATERMARK.pdf
_version_ 1848889136962863104
author Ahmed, Attika
building UTHM Institutional Repository
collection Online Access
description In software projects, a data repository holds the bug reports. These bugs must be analyzed carefully to resolve the underlying problems. Handling them manually is extremely time-consuming and can delay the resolution of important bugs. To overcome this problem, researchers have introduced many techniques, one of which is bug clustering. A variety of clustering algorithms is available for this purpose. One commonly used algorithm for bug clustering is K-means, which is considered one of the simplest unsupervised learning algorithms for clustering, yet it tends to produce a smaller number of clusters. Among unsupervised learning algorithms, the Self-Organizing Map (SOM) is considered an equally suitable algorithm for clustering; the two algorithms are closely related but differ in the way they are used in data mining. This research attempts a comparative analysis of the two clustering algorithms; to obtain the results, a series of experiments was conducted on a Mozilla bugs data set. To address the data sparseness issue, the experiments were performed on textual bug data using two different distance measures: Euclidean distance and the Jaccard New Measure. The results suggest that SOM is limited by poor performance on sparse data sets. The research therefore introduces an improved SOM algorithm that uses the Jaccard New Measure (SOM-JNM). SOM-JNM produced significantly better results and can therefore be considered a competitive approach to addressing sparse-data problems.
first_indexed 2025-11-15T20:21:23Z
format Thesis
id uthm-7554
institution Universiti Tun Hussein Onn Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T20:21:23Z
publishDate 2018
recordtype eprints
repository_type Digital Repository
spelling uthm-7554 2022-08-21T01:45:13Z http://eprints.uthm.edu.my/7554/ An improved self organizing map using jaccard new measure for textual bugs data clustering Ahmed, Attika QA Mathematics QA1-43 General
2018-01 Thesis NonPeerReviewed text en Ahmed, Attika (2018) An improved self organizing map using jaccard new measure for textual bugs data clustering. Masters thesis, Universiti Tun Hussein Onn Malaysia.
title An improved self organizing map using jaccard new measure for textual bugs data clustering
topic QA Mathematics
QA1-43 General
url http://eprints.uthm.edu.my/7554/
http://eprints.uthm.edu.my/7554/1/24p%20ATTIKA%20AHMED.pdf
http://eprints.uthm.edu.my/7554/2/ATTIKA%20AHMED%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/7554/3/ATTIKA%20AHMED%20WATERMARK.pdf