Duplicate bug report detection using clustering
Bug reporting and fixing the reported bugs play a critical part in the development and maintenance of software systems. The software developers and end users can collaborate in this process to improve the reliability of software systems. Various end users report the defects they have found in the so...
| Main Authors: | Gopalan, Raj; Krishna, Aneesh |
|---|---|
| Other Authors: | Jim Steel |
| Format: | Conference Paper |
| Published: | IEEE 2014 |
| Subjects: | bug report; duplicate detection; Bugzilla; clustering |
| Online Access: | http://hdl.handle.net/20.500.11937/11144 |
| _version_ | 1848747726585462784 |
|---|---|
| author | Gopalan, Raj; Krishna, Aneesh |
| author2 | Jim Steel |
| author_facet | Jim Steel; Gopalan, Raj; Krishna, Aneesh |
| author_sort | Gopalan, Raj |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | Bug reporting and fixing the reported bugs play a critical part in the development and maintenance of software systems. The software developers and end users can collaborate in this process to improve the reliability of software systems. Various end users report the defects they have found in the software and how these bugs affect them. However, the same defect may be reported independently by several users leading to a significant number of duplicate bug reports. There are a number of existing methods for detecting duplicate bug reports, but the best results so far account for only 24% of actual duplicates. In this paper, we propose a new method based on clustering to identify a larger proportion of duplicate bug reports while keeping the false positives of misidentified non-duplicates low. The proposed approach is experimentally evaluated on a large sample of bug reports from three public domain data sets. The results show that this approach achieves better performance in terms of a harmonic measure that combines true positive and true negative rates when compared to the existing methods. |
| first_indexed | 2025-11-14T06:53:44Z |
| format | Conference Paper |
| id | curtin-20.500.11937-11144 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T06:53:44Z |
| publishDate | 2014 |
| publisher | IEEE |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-111442023-02-13T08:01:37Z Duplicate bug report detection using clustering Gopalan, Raj Krishna, Aneesh Jim Steel Liming Zhu bug report duplicate detection Bugzilla clustering Bug reporting and fixing the reported bugs play a critical part in the development and maintenance of software systems. The software developers and end users can collaborate in this process to improve the reliability of software systems. Various end users report the defects they have found in the software and how these bugs affect them. However, the same defect may be reported independently by several users leading to a significant number of duplicate bug reports. There are a number of existing methods for detecting duplicate bug reports, but the best results so far account for only 24% of actual duplicates. In this paper, we propose a new method based on clustering to identify a larger proportion of duplicate bug reports while keeping the false positives of misidentified non-duplicates low. The proposed approach is experimentally evaluated on a large sample of bug reports from three public domain data sets. The results show that this approach achieves better performance in terms of a harmonic measure that combines true positive and true negative rates when compared to the existing methods. 2014 Conference Paper http://hdl.handle.net/20.500.11937/11144 10.1109/ASWEC.2014.31 IEEE fulltext |
| spellingShingle | bug report duplicate detection Bugzilla clustering Gopalan, Raj Krishna, Aneesh Duplicate bug report detection using clustering |
| title | Duplicate bug report detection using clustering |
| title_full | Duplicate bug report detection using clustering |
| title_fullStr | Duplicate bug report detection using clustering |
| title_full_unstemmed | Duplicate bug report detection using clustering |
| title_short | Duplicate bug report detection using clustering |
| title_sort | duplicate bug report detection using clustering |
| topic | bug report; duplicate detection; Bugzilla; clustering |
| url | http://hdl.handle.net/20.500.11937/11144 |
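
The description above mentions "a harmonic measure that combines true positive and true negative rates" but the record does not give its formula. The sketch below assumes it is the ordinary harmonic mean of the two rates; the function name `harmonic_measure` and the example counts are hypothetical and are not taken from the paper.

```python
def harmonic_measure(tp: int, fn: int, tn: int, fp: int) -> float:
    """Harmonic mean of the true positive rate and the true negative rate.

    Assumption: the record only says the measure "combines" the two rates;
    the exact formula used in the paper may differ from this one.
    """
    # True positive rate: share of actual duplicates that were flagged.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # True negative rate: share of non-duplicates correctly left unflagged.
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    if tpr + tnr == 0.0:
        return 0.0
    return 2.0 * tpr * tnr / (tpr + tnr)


if __name__ == "__main__":
    # Hypothetical confusion counts, chosen only to exercise the function.
    print(harmonic_measure(tp=50, fn=50, tn=50, fp=50))
```

Under this assumption, a detector scores well only when both rates are high: finding few duplicates pulls the score down even if almost no non-duplicates are misidentified.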