| Summary: | A major demand in software project development is high system functionality that
can meet complex system requirements. The ability to predict probable
software system failures early can help organizations in making decisions about
possible solutions and improvements, including engaging new experts and changing
the project development plan. Inaccurate failure analysis could lead the software project
toward undesirable events. Therefore, to overcome this problem, this research focuses
on early software project failure prediction using different machine learning methods.
Furthermore, ensemble techniques are used to improve the model classification results,
as the differing classification abilities of their base classifiers enable the proposed
algorithms to capture different characteristics of the training data and produce more
reliable and accurate classifications. This research aims to determine the factors behind
software project failures in order to develop predictive models using ensemble methods
trained on a dataset constructed from historical data collected from past software project
reports. A framework for software project failure prediction is proposed to obtain the
expected software project failure as well as the project's failure probability. To obtain
reliable and accurate software failure prediction, we used an evidence-based approach
that depends on gathering information about successful and failed software projects
from available resources such as reports, case studies and surveys. The first step of
developing the classification models is structuring a dataset. Then, the constructed data
is partitioned into training and testing sets. The training data is used in different ways
to train the models while the testing data is used to measure their prediction
performance. After the model is developed and tested, it can be deployed and used
during the actual software project development process to predict the future outcomes of
these projects. Initially, the predictive model is implemented using six existing
machine learning techniques in an attempt to achieve diversity. Furthermore, the
research proposes two machine learning ensemble approaches to enhance the
performance of the predictive models. The first proposed model uses the results of the
six implemented models to develop a new ensemble model based on majority voting. The
second model proposes a new approach that gives a higher rank (weight) to the base
classifier that performs better than the other classifiers in the ensemble at predicting
the most difficult data. Finally, the performance of the developed models is
compared using different measures such as confusion matrix, accuracy and sensitivity.
The results show that the proposed weighted ensemble method for predicting
software project failures performs better than the other methods in terms of accuracy
(90%), sensitivity (92%) and other performance measures. Nevertheless, the other
developed models also appear fairly accurate and produce acceptable performance results.
This research began by identifying factors behind software project successes and failures
in order to develop accurate failure prediction models using different methods. This
research contributes to the field of software system development by extracting a software
project failure dataset and by developing software project failure classification models
that can be applied generally to any software project during any phase of the software
development process. Most previously proposed classification models and tools were
developed and verified based on certain case studies. Furthermore, the research
proposes new ensemble machine learning models to improve failure prediction
performance. Finally, the research suggests that the proposed models can be integrated
within the development process of the software system. This integration is realized
through developing an evaluation tool that generates the failure probability of the project.
|
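
The two ensemble approaches described in the summary can be illustrated with a minimal sketch. It is not the implementation used in the research. The function names, the definition of "difficult" instances (validation instances that at least half of the base classifiers misclassify), and the use of accuracy on those instances as the weight are all assumptions made for illustration, since the summary does not specify them.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[int]) -> int:
    """Return the label chosen by most base classifiers (ties go to the smallest label)."""
    counts = Counter(predictions)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)


def difficulty_weights(val_preds: List[List[int]], val_labels: List[int]) -> List[float]:
    """Hypothetical weighting scheme: a classifier's weight is its accuracy on
    'difficult' validation instances. ASSUMPTION: an instance is difficult when
    at least half of the base classifiers misclassify it."""
    n_clf = len(val_preds)
    hard = [
        i for i, y in enumerate(val_labels)
        if 2 * sum(preds[i] != y for preds in val_preds) >= n_clf
    ]
    if not hard:
        # No difficult instances: fall back to equal weights.
        return [1.0] * n_clf
    return [
        sum(preds[i] == val_labels[i] for i in hard) / len(hard)
        for preds in val_preds
    ]


def weighted_vote(predictions: Sequence[int], weights: Sequence[float]) -> int:
    """Return the label with the largest total weight (ties go to the smallest label)."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    best = max(scores.values())
    return min(label for label, score in scores.items() if score == best)


# Toy validation data: 3 base classifiers, 4 projects (1 = failed, 0 = successful).
val_labels = [1, 0, 1, 1]
val_preds = [
    [1, 0, 1, 1],  # classifier A
    [1, 0, 0, 0],  # classifier B
    [1, 1, 0, 0],  # classifier C
]
weights = difficulty_weights(val_preds, val_labels)

# A new project on which the classifiers disagree.
new_preds = [1, 0, 0]
unweighted = majority_vote(new_preds)
weighted = weighted_vote(new_preds, weights)
```

In this toy data, classifier A is the only one that is correct on the difficult instances, so the weighted vote follows A while the plain majority vote follows B and C. A project failure probability could be derived in a similar way from the normalized weighted scores, although the summary does not state how the research computes it.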