Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman

Bibliographic Details
Main Author: Mashahi Khalafalla, Dafaalla Abdelrahman
Format: Thesis
Published: 2020
Subjects: QA76 Computer software; TA Engineering (General). Civil engineering (General)
Online Access: http://studentsrepo.um.edu.my/12935/
http://studentsrepo.um.edu.my/12935/2/Mashahi_Khalafalla.pdf
http://studentsrepo.um.edu.my/12935/1/Mashahi_Khalafalla.pdf
_version_ 1848774756506009600
author Mashahi Khalafalla , Dafaalla Abdelrahman
building UM Research Repository
collection Online Access
description Requirement traceability can be regarded as a measure of software quality that supports validation, verification, and reusability. Neglecting traceability leads to less maintainable software. Creating traceability links after the fact, known as traceability recovery, is tedious and time-consuming when done manually. Information retrieval (IR) methods have therefore been used to identify traceability links between artifacts automatically. However, limitations of both the software engineer and the IR techniques negatively affect the performance of IR methods. No IR method recovers traceability links between artifacts with both high precision and high recall; in the Vector Space Model (VSM), for example, retrieved false positives lower precision. Nevertheless, VSM is widely used because it is considered the simplest linear-algebraic method and is easy for non-IR experts to understand and use. It allows documents to be ranked according to their probable relevance, and many tools and open-source implementations, such as RETRO and ReqSimile, implement it. The research aims to assist software engineers (analysts) in recovering traceability links between software artifacts by suggesting the appropriate type of phrases, which enhances the performance of the IR method. The research objectives are: 1) to investigate IR methods for traceability recovery; 2) to propose a method that achieves high performance (recall and precision as high as possible) in traceability recovery; and 3) to empirically validate the proposed method through an experimental analysis that demonstrates its ability to improve performance (recall and precision as high as possible) in traceability recovery. A comparative experiment was conducted by extracting noun phrases (NP), verb phrases (VP), and combinations of noun and verb phrases (NPVP) from three benchmark datasets: CM1, MODIS, and PINE. VSM was applied and the results were evaluated in terms of recall and precision. The results showed that indexing NP only tends to outperform VP, NPVP, and all terms, achieving recall and precision as high as possible.
first_indexed 2025-11-14T14:03:22Z
format Thesis
id um-12935
institution University Malaya
institution_category Local University
last_indexed 2025-11-14T14:03:22Z
publishDate 2020
recordtype eprints
repository_type Digital Repository
spelling um-12935 2022-03-11T01:58:14Z Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman Mashahi Khalafalla , Dafaalla Abdelrahman QA76 Computer software TA Engineering (General). Civil engineering (General) 2020-07 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/12935/2/Mashahi_Khalafalla.pdf application/pdf http://studentsrepo.um.edu.my/12935/1/Mashahi_Khalafalla.pdf Mashahi Khalafalla , Dafaalla Abdelrahman (2020) Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman. Masters thesis, Universiti Malaya. http://studentsrepo.um.edu.my/12935/
spellingShingle QA76 Computer software
TA Engineering (General). Civil engineering (General)
Mashahi Khalafalla , Dafaalla Abdelrahman
Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman
title Enhancing the performance of IR-based traceability recovery of requirement artifacts using noun phrases / Mashahi Khalafalla Dafaalla Abdelrahman
title_sort enhancing the performance of ir-based traceability recovery of requirement artifacts using noun phrases / mashahi khalafalla dafaalla abdelrahman
topic QA76 Computer software
TA Engineering (General). Civil engineering (General)
url http://studentsrepo.um.edu.my/12935/
http://studentsrepo.um.edu.my/12935/2/Mashahi_Khalafalla.pdf
http://studentsrepo.um.edu.my/12935/1/Mashahi_Khalafalla.pdf