Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis
This project examines the application of machine learning algorithms to credit card fraud detection. For this task, Insider Technologies Limited supplied real-world data consisting of 12.7 million transactions and 18 feature columns. During the project, 96 additional f...
| Main Author: | Hätälä, Tomas Sebastian |
|---|---|
| Format: | Dissertation (University of Nottingham only) |
| Language: | English |
| Published: | 2015 |
| Subjects: | Credit card fraud detection machine learning classification feature engineering unbalancing. |
| Online Access: | https://eprints.nottingham.ac.uk/30805/ |
| _version_ | 1848794064764272640 |
|---|---|
| author | Hätälä, Tomas Sebastian |
| author_facet | Hätälä, Tomas Sebastian |
| author_sort | Hätälä, Tomas Sebastian |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | This project examines the application of machine learning algorithms to credit card fraud detection. For this task, Insider Technologies Limited supplied real-world data consisting of 12.7 million transactions and 18 feature columns. During the project, 96 additional features were added. The performance of five feature subsets was then evaluated using the classification algorithms Naïve Bayes, Logistic Regression, Support Vector Machines, k-Nearest Neighbours, Random Forest and Neural Networks. The feature subset containing only single-transaction features performed best. The supplied data is highly imbalanced, with only 335 transactions marked as fraudulent. Different under- and oversampling algorithms were therefore applied to the single-transaction feature subset. Finally, the performance of the machine learning algorithms was evaluated on the sampled data. Most algorithms performed better when trained on the balanced data, with Random Forest performing best, followed by Deep Learning. The datasets sampled using random undersampling, or a combination including random undersampling, outperformed the others. It is concluded that this is because only random undersampling achieved a complete class balance. |
| first_indexed | 2025-11-14T19:10:15Z |
| format | Dissertation (University of Nottingham only) |
| id | nottingham-30805 |
| institution | University of Nottingham Malaysia Campus |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-14T19:10:15Z |
| publishDate | 2015 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | nottingham-30805 2017-10-19T15:06:47Z https://eprints.nottingham.ac.uk/30805/ Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis Hätälä, Tomas Sebastian 2015-12-10 Dissertation (University of Nottingham only) NonPeerReviewed application/pdf en https://eprints.nottingham.ac.uk/30805/1/MIT-dissert_85_Tomas_H%C3%A4t%C3%A4l%C3%A4_2015-PRIZE-WINNER.pdf Hätälä, Tomas Sebastian (2015) Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis. [Dissertation (University of Nottingham only)] Credit card fraud detection machine learning classification feature engineering unbalancing. |
| spellingShingle | Credit card fraud detection machine learning classification feature engineering unbalancing. Hätälä, Tomas Sebastian Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis |
| title | Prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis |
| title_sort | prototypic implementation and comparison of fraud detection algorithms based on methods of statistical analysis |
| topic | Credit card fraud detection machine learning classification feature engineering unbalancing. |
| url | https://eprints.nottingham.ac.uk/30805/ |
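
The description above credits random undersampling with being the only sampling method that reached a complete class balance. The following is a minimal sketch of that general technique, not the dissertation's own implementation, which this record does not include. The function name `random_undersample`, the `transaction_id` field, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Return a class-balanced subset by randomly dropping majority-class samples.

    Hypothetical helper for illustration; not taken from the dissertation.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Every class is reduced to the size of the smallest class.
    target = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    kept.sort()  # keep the original ordering of the retained samples

    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data (invented): 1000 transactions, 5 of them labelled fraudulent (1).
toy_labels = [1 if i % 200 == 0 else 0 for i in range(1000)]
toy_samples = [{"transaction_id": i} for i in range(1000)]

balanced_samples, balanced_labels = random_undersample(
    toy_samples, toy_labels, seed=42
)
print(Counter(balanced_labels))
```

Because the minority class sets the target size, every fraudulent sample is retained and only legitimate transactions are discarded, which is why the result is exactly balanced. The cost is that most of the majority-class data goes unused during training.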