Combining cluster quality index and supervised learning to predict students’ academic performance

Predicting students' academic performance can help the institution to take timely action, such as planning intervention measures to improve students’ academic achievement. This study aims to identify the main factors contributing to the postgraduate student’s academic performance. Preliminary p...


Bibliographic Details
Main Authors: Suhaila Zainudin, Rapi’ah Ibrahim, Hafiz Mohd Sarim
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2024
Online Access: http://journalarticle.ukm.my/23995/
http://journalarticle.ukm.my/23995/1/72%20-%2092.pdf
building UKM Institutional Repository
collection Online Access
description Predicting students' academic performance can help an institution take timely action, such as planning interventions to improve students’ academic achievement. This study aims to identify the main factors contributing to postgraduate students’ academic performance. Early predictions can help prevent student dropouts, especially at the postgraduate level. The results of this study are significant for supporting institutional decision-making and for formulating the best strategies for the primary stakeholders (students). The study combines two data mining tasks, clustering and classification, to undertake the prediction task. First, K-Means clustering was applied to identify different student groups. Then, the clusters were evaluated with cluster quality indexes, namely the Silhouette Coefficient, the Calinski-Harabasz Index and the Davies-Bouldin Index, to determine the best clusters. The best number of clusters was selected based on the Silhouette Coefficient score because this coefficient is bounded between -1 and 1, which makes scores easy to compare. The best cluster was then analysed further using classification to predict students’ academic performance. Three classification algorithms were selected: Logistic Regression (LR), Support Vector Machine (SVM) and Decision Tree (DT). The results show that the LR model predicts students’ academic performance levels better than SVM and DT.
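The cluster-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a standard-library Python example that runs a simple K-Means for several values of k and keeps the k with the highest mean Silhouette Coefficient. The toy `points`, the function names, and the deterministic farthest-point initialisation are all assumptions made for illustration, and the later classification stage (LR, SVM, DT) is not shown.

```python
import math


def kmeans(points, k, iters=100):
    """Simple Lloyd's K-Means; returns one cluster label per point.

    Assumption: farthest-point initialisation, chosen only to keep the
    sketch deterministic. It is not taken from the paper.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def silhouette_score(points, labels):
    """Mean Silhouette Coefficient; every per-point value lies in [-1, 1]."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # common convention: a singleton cluster scores 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: three well-separated groups of two-feature records.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

# Score each candidate k and keep the one with the highest mean silhouette.
scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

Because every silhouette value falls in the same fixed range, scores for different values of k can be compared directly, which is the property the abstract gives as the reason for choosing this index.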
id oai:generic.eprints.org:23995
institution Universiti Kebangsaan Malaysia
Citation: Suhaila Zainudin, Rapi’ah Ibrahim and Hafiz Mohd Sarim (2024) Combining cluster quality index and supervised learning to predict students’ academic performance. Asia-Pacific Journal of Information Technology and Multimedia, 13 (1). pp. 72-92. ISSN 2289-2192. Peer reviewed. Published by Penerbit Universiti Kebangsaan Malaysia, 2024-06-01. https://www.ukm.my/apjitm