Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis

Bibliographic Details
Main Author: Hätälä, Tomas Sebastian
Format: Dissertation (University of Nottingham only)
Language: English
Published: 2015
Subjects:
Online Access: https://eprints.nottingham.ac.uk/30805/
building Nottingham Research Data Repository
collection Online Access
description This project examines the application of machine learning algorithms to credit card fraud detection. For this task, the company Insider Technologies Limited supplied real-world data consisting of 12.7 million transactions and 18 feature columns. During the project, 96 additional features were added. The performance of five feature subsets was then evaluated using the classification algorithms Naïve Bayes, Logistic Regression, Support Vector Machines, k-Nearest Neighbours, Random Forest and Neural Networks. The feature subset containing only single-transaction features performed best. The supplied data is also highly imbalanced, with only 335 transactions marked as fraudulent. Therefore, different under- and oversampling algorithms were applied to the single-transaction feature subset. The machine learning algorithms were then evaluated on the sampled data. Most algorithms performed better when trained on the balanced data, with Random Forest performing best, followed by Deep Learning. The datasets sampled using random undersampling, or a combination including random undersampling, outperformed the others. It is concluded that this is because only random undersampling was able to achieve a complete class balance.
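The class imbalance and the random undersampling step mentioned in the description can be illustrated with a minimal Python sketch. It is not taken from the dissertation: the function name random_undersample, the 0/1 label encoding and the toy data are assumptions made for illustration only. With the figures quoted above, 335 fraudulent rows out of roughly 12.7 million is about 0.003% of the data.

```python
import random

def random_undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset of `labels`.

    Hypothetical helper: keeps every minority-class row and an equally
    sized random sample of majority-class rows. Labels are assumed to be
    0 (legitimate) or 1 (fraudulent).
    """
    fraud = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]
    if len(fraud) <= len(legit):
        minority, majority = fraud, legit
    else:
        minority, majority = legit, fraud
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)

# Fraud share implied by the figures in the description (12.7 million is rounded).
abstract_fraud_share = 335 / 12_700_000

# Toy data, NOT the dissertation's data: 1,000 transactions, 5 fraudulent.
labels = [0] * 1000
for i in (3, 250, 500, 750, 999):
    labels[i] = 1

balanced = random_undersample(labels)
fraud_share_before = sum(labels) / len(labels)
fraud_share_after = sum(labels[i] for i in balanced) / len(balanced)
print(f"{abstract_fraud_share:.6%} {fraud_share_before:.1%} {fraud_share_after:.1%}")
```

The sketch discards majority-class rows until both classes are the same size, which is what a complete class balance means here. The cost is that most legitimate transactions are thrown away before training.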
id nottingham-30805
institution University of Nottingham Malaysia Campus
institution_category Local University
recordtype eprints
repository_type Digital Repository
spelling nottingham-30805 2017-10-19T15:06:47Z https://eprints.nottingham.ac.uk/30805/ 2015-12-10 Dissertation (University of Nottingham only) NonPeerReviewed application/pdf en https://eprints.nottingham.ac.uk/30805/1/MIT-dissert_85_Tomas_H%C3%A4t%C3%A4l%C3%A4_2015-PRIZE-WINNER.pdf Hätälä, Tomas Sebastian (2015) Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis. [Dissertation (University of Nottingham only)]
topic Credit card fraud detection
machine learning classification
feature engineering
unbalancing.