Comparison of Logistic Regression, Random Forest, SVM, KNN Algorithm for Water Quality Classification Based on Contaminant Parameters

This study compares four machine learning algorithms, Logistic Regression, Random Forest, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), in water quality classification based on contaminant parameters. The purpose of this study is to evaluate and compare the performance of these algor...


Bibliographic Details
Main Authors: Teguh, Sutanto; Muhammad Rafli, Aditya; Haldi, Budiman; M.Rezqy, Noor Ridha; Usman, Syapotro; Noor, Azijah
Format: Article
Language: English
Published: INTI International University 2024
Subjects: QA75 Electronic computers. Computer science; QA76 Computer software; T Technology (General)
Online Access: http://eprints.intimal.edu.my/2047/
http://eprints.intimal.edu.my/2047/1/jods2024_48.pdf
http://eprints.intimal.edu.my/2047/2/588
_version_ 1848766906045038592
author Teguh, Sutanto
Muhammad Rafli, Aditya
Haldi, Budiman
M.Rezqy, Noor Ridha
Usman, Syapotro
Noor, Azijah
author_sort Teguh, Sutanto
building INTI Institutional Repository
collection Online Access
description This study compares four machine learning algorithms, Logistic Regression, Random Forest, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), in water quality classification based on contaminant parameters. The purpose of this study is to evaluate and compare the performance of these algorithms in terms of accuracy. The methodology includes data collection, preprocessing, and algorithm implementation, with evaluation using cross-validation techniques. The results showed that the Stacking method with a Gradient Boosting meta-learner produced the highest accuracy of 96.00%, outperforming all other algorithms. In comparison, Random Forest achieved 95.75% accuracy, followed by SVM with 93.25% accuracy, and Logistic Regression and KNN each achieved 90.19% accuracy. This finding emphasizes that Stacking with Gradient Boosting provides much better performance in water quality classification compared to other models. This research provides new insights into the application of machine learning algorithms for water quality management as well as guidance for optimal algorithm selection.
first_indexed 2025-11-14T11:58:35Z
format Article
id intimal-2047
institution INTI International University
institution_category Local University
language English
last_indexed 2025-11-14T11:58:35Z
publishDate 2024
publisher INTI International University
recordtype eprints
repository_type Digital Repository
spelling intimal-2047 2024-11-26T06:15:07Z http://eprints.intimal.edu.my/2047/
Comparison of Logistic Regression, Random Forest, SVM, KNN Algorithm for Water Quality Classification Based on Contaminant Parameters
Teguh, Sutanto
Muhammad Rafli, Aditya
Haldi, Budiman
M.Rezqy, Noor Ridha
Usman, Syapotro
Noor, Azijah
QA75 Electronic computers. Computer science
QA76 Computer software
T Technology (General)
INTI International University 2024-11 Article PeerReviewed
text en cc_by_4 http://eprints.intimal.edu.my/2047/1/jods2024_48.pdf
text en cc_by_4 http://eprints.intimal.edu.my/2047/2/588
Teguh, Sutanto and Muhammad Rafli, Aditya and Haldi, Budiman and M.Rezqy, Noor Ridha and Usman, Syapotro and Noor, Azijah (2024) Comparison of Logistic Regression, Random Forest, SVM, KNN Algorithm for Water Quality Classification Based on Contaminant Parameters. Journal of Data Science, 2024 (48). pp. 1-7. ISSN 2805-5160 http://ipublishing.intimal.edu.my/jods.html
title Comparison of Logistic Regression, Random Forest, SVM, KNN Algorithm for Water Quality Classification Based on Contaminant Parameters
title_sort comparison of logistic regression, random forest, svm, knn algorithm for water quality classification based on contaminant parameters
topic QA75 Electronic computers. Computer science
QA76 Computer software
T Technology (General)
url http://eprints.intimal.edu.my/2047/
http://eprints.intimal.edu.my/2047/1/jods2024_48.pdf
http://eprints.intimal.edu.my/2047/2/588
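
The abstract above describes evaluating classifiers by accuracy under cross-validation. As a rough illustration of that evaluation step only, the following is a minimal sketch in standard-library Python that estimates the k-fold cross-validated accuracy of a small K-Nearest Neighbors classifier. It is not the authors' code: the data set is synthetic, the function names (knn_predict, cross_val_accuracy, make_toy_data) and all parameter values are hypothetical, and the Stacking model with a Gradient Boosting meta-learner reported in the study is not reproduced here.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=5, k=3, seed=0):
    """Return the mean accuracy of KNN over n_folds cross-validation folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Every index lands in exactly one fold, so each sample is tested once.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical two-feature data: class 0 near (0, 0), class 1 near (5, 5)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"5-fold CV accuracy (KNN, k=3): {cross_val_accuracy(X, y):.4f}")
```

Holding out each fold in turn means every sample is scored by a model that never saw it during training, which is why cross-validated accuracy is a fairer basis for comparing algorithms than accuracy on the training data. The same loop could wrap any other classifier that exposes a comparable predict function.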