Predicting Customer Buying Decisions for Online Shopping with Unbalanced Data Set

A common e-commerce problem is the low purchase conversion rate. Data mining techniques can help tackle the problem by analysing and predicting customer purchase intention, so that customers receive better service and better recommendations. In this project, the real-time online shoppers purc...

Bibliographic Details
Main Author: Yap, Chau Tean
Format: Final Year Project / Dissertation / Thesis
Published: 2022
Subjects: T Technology (General)
Online Access: http://eprints.utar.edu.my/4990/
http://eprints.utar.edu.my/4990/1/YAP_CHAU_TEAN_2000681.pdf
author Yap, Chau Tean
building UTAR Institutional Repository
collection Online Access
description A common e-commerce problem is the low purchase conversion rate. Data mining techniques can help tackle the problem by analysing and predicting customer purchase intention, so that customers receive better service and better recommendations. In this project, the real-time online shoppers purchasing intention data set from Sakar et al. (2018) was used. The data set is unbalanced: 15.5% of its instances belong to the positive class and 84.5% to the negative class. Weka, a data mining tool, provides facilities for classifying the data set with different machine learning algorithms. Six machine learning algorithms were applied and compared using classification evaluation methods: K-Nearest Neighbor (KNN), Naïve Bayes, J48, Support Vector Machine (SVM), Sequential Minimal Optimization (SMO) and Multilayer Perceptron (MLP). Because pre-processing the data set may improve the classification results, over-sampling, under-sampling and hybrid sampling were used to modify the class distribution of the data set. The hybrid sampling method gave classification results comparable to those of Sakar et al. (2018). The ensemble learning methods AdaBoost and Bagging were also tested but showed no improvement on this online shoppers purchasing intention data set.
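The hybrid sampling idea mentioned in the description can be illustrated with a minimal sketch. This is not the thesis's implementation, which used Weka; it assumes plain random over-sampling of the smaller class combined with random under-sampling of the larger class. The function name `hybrid_sample`, the toy data, and the target of 500 rows per class are all hypothetical choices made for illustration; only the 15.5% / 84.5% class ratio comes from the description.

```python
import random
from collections import Counter


def hybrid_sample(records, target_per_class, seed=0):
    """Resample (features, label) pairs so each class has target_per_class rows.

    Classes larger than the target are under-sampled without replacement.
    Classes smaller than the target keep every original row and are topped
    up by drawing extra rows with replacement (random over-sampling).
    """
    rng = random.Random(seed)

    # Group the records by class label.
    by_class = {}
    for features, label in records:
        by_class.setdefault(label, []).append((features, label))

    resampled = []
    for rows in by_class.values():
        if len(rows) >= target_per_class:
            # Under-sample: pick a random subset of the larger class.
            resampled.extend(rng.sample(rows, target_per_class))
        else:
            # Over-sample: keep all rows, then duplicate random ones.
            extra = rng.choices(rows, k=target_per_class - len(rows))
            resampled.extend(rows + extra)

    rng.shuffle(resampled)
    return resampled


# Toy data (assumption) mirroring the stated ratio: 155 positive, 845 negative.
toy = [((i,), True) for i in range(155)] + [((i,), False) for i in range(845)]
balanced = hybrid_sample(toy, target_per_class=500)
counts = Counter(label for _, label in balanced)
print(counts[True], counts[False])
```

In practice, resampling of this kind would normally be applied to the training split only, so that evaluation still reflects the original class distribution.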
first_indexed 2025-11-15T19:36:15Z
format Final Year Project / Dissertation / Thesis
id utar-4990
institution Universiti Tunku Abdul Rahman
institution_category Local University
last_indexed 2025-11-15T19:36:15Z
publishDate 2022
recordtype eprints
repository_type Digital Repository
spelling utar-4990 2022-12-26T09:40:53Z Yap, Chau Tean (2022) Predicting Customer Buying Decisions for Online Shopping with Unbalanced Data Set. Master dissertation/thesis, UTAR. NonPeerReviewed application/pdf http://eprints.utar.edu.my/4990/
title Predicting Customer Buying Decisions for Online Shopping with Unbalanced Data Set
topic T Technology (General)