Improving hand written digit recognition using hybrid feature selection algorithm

In the field of machine learning, handwritten digit recognition is known as one of the crucial problems for pattern recognition and computer vision applications. Applications of handwritten digit recognition include recognizing the digits on a utility map, zip code on a post...


Bibliographic Details
Main Author: Wong, Khye Mun
Format: Final Year Project / Dissertation / Thesis
Published: 2022
Subjects: HA Statistics; Q Science (General); T Technology (General)
Online Access: http://eprints.utar.edu.my/4940/
http://eprints.utar.edu.my/4940/1/fyp_2022_SC_WKM.pdf
author Wong, Khye Mun
building UTAR Institutional Repository
collection Online Access
description In the field of machine learning, handwritten digit recognition is known as one of the crucial problems for pattern recognition and computer vision applications. Its applications include recognizing the digits on a utility map, reading the zip code on postal mail, processing bank check amounts and many more. Offline handwritten digits vary in traits such as size, orientation, position, and thickness. Every individual's handwriting is unique, which increases the difficulty of the classification process. High outline similarity between certain digits and overfitting on high-dimensional data further affect the computational time and cost. Therefore, many researchers have developed and applied machine learning algorithms that can efficiently tackle the handwritten digit recognition problem. In this report, the main objective was to obtain the binary classification accuracy of handwritten digit recognition on the Multiple Features dataset (MFEAT). Minimum Redundancy and Maximum Relevance (mRMR) was used as the primary approach because, as a filter method, it has advantages over wrapper and embedded methods: it saves computational time while accounting for both the relevance of the candidate features and the redundancy among the selected handwritten digit features. Although mRMR can identify a subset of features that are highly relevant to the target classification variable, it has the weakness that it can still capture redundant features. Support Vector Machine Recursive Feature Elimination (SVM-RFE), an embedded method, was selected as an alternative approach alongside mRMR.
SVM-RFE selects a feature subset based on a ranking-weight criterion: insignificant features with small ranking weights are removed, and only significant features with greater influence are retained. However, although RFE can effectively eliminate less important features and exclude redundant ones, it has the flaw that the features it selects are not ranked by importance. In view of their respective strengths and deficiencies, this study combined the two methods and used a support vector machine (SVM) as the underlying classifier, anticipating that mRMR would complement SVM-RFE well. The hybrid method was demonstrated on a binary classification between the digits '4' and '9' from the Multiple Features dataset. The proposed hybrid method and two additional predictive models, namely mRMR alone and SVM-RFE alone, were built for comparison. With the proposed hybrid method, four significant features were shortlisted to achieve the highest accuracy of 100.00%. In addition, the proposed hybrid method achieved the highest test accuracy of 99.2% when only one feature was included. The results showed that the hybrid mRMR+SVM-RFE was better than both the sole SVM-mRMR and the sole SVM-RFE approaches, in the sense that the hybrid approach achieved higher classification accuracy using a smaller number of features.
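The description names the two selection stages but this record does not say exactly how the thesis chains them. The following is a minimal pure-Python sketch of one plausible combination, assuming mRMR pre-filters a candidate set and SVM-RFE then prunes it. Everything here is illustrative and not the author's code: the function names (`mrmr`, `svm_rfe`, `hybrid_select`), the use of absolute Pearson correlation in place of mutual information, the simple subgradient-descent linear SVM, and the tiny hand-made dataset (not MFEAT) are all assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((z - mb) ** 2 for z in b)
    return 0.0 if saa == 0 or sbb == 0 else sab / math.sqrt(saa * sbb)

def mrmr(X, y, k):
    """Greedy mRMR, difference form: relevance minus mean redundancy.
    Assumption: |Pearson correlation| stands in for mutual information."""
    cols = list(zip(*X))
    rel = [abs(pearson(c, y)) for c in cols]
    selected = [max(range(len(cols)), key=lambda j: rel[j])]
    while len(selected) < k:
        def score(j):
            red = sum(abs(pearson(cols[j], cols[s])) for s in selected) / len(selected)
            return rel[j] - red
        rest = [j for j in range(len(cols)) if j not in selected]
        selected.append(max(rest, key=score))
    return selected

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss (labels in {-1, +1})."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def svm_rfe(X, y, features, n_keep):
    """Retrain repeatedly, each time dropping the feature with the smallest squared weight."""
    active = list(features)
    while len(active) > n_keep:
        Xa = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(Xa, y)
        active.pop(min(range(len(active)), key=lambda i: w[i] ** 2))
    return active

def hybrid_select(X, y, n_candidates, n_keep):
    """Hypothetical hybrid: mRMR pre-filter, then SVM-RFE pruning."""
    return svm_rfe(X, y, mrmr(X, y, n_candidates), n_keep)

# Toy data (hypothetical): f1 duplicates f0, f2 is weakly informative, f3 is noise.
y = [-1, -1, -1, -1, 1, 1, 1, 1]
f0 = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
f1 = list(f0)
f2 = [-1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0]
f3 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
X = [list(row) for row in zip(f0, f1, f2, f3)]

print(mrmr(X, y, 3))              # [0, 2, 1]: the duplicate f1 still gets in third
print(hybrid_select(X, y, 3, 2))  # two of the three mRMR candidates survive SVM-RFE
```

On this toy data the third mRMR pick is the exact duplicate of the first, which illustrates the weakness the description attributes to mRMR: a strongly relevant feature can still be selected even when it is redundant. The second stage is included to show where an SVM-based pruning step could act on such candidates; the numbers reported in the description come from the thesis, not from this sketch.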
first_indexed 2025-11-15T19:36:01Z
format Final Year Project / Dissertation / Thesis
id utar-4940
institution Universiti Tunku Abdul Rahman
institution_category Local University
last_indexed 2025-11-15T19:36:01Z
publishDate 2022
recordtype eprints
repository_type Digital Repository
spelling utar-4940 2023-01-05T14:03:16Z Improving hand written digit recognition using hybrid feature selection algorithm Wong, Khye Mun HA Statistics Q Science (General) T Technology (General) 2022-04 Final Year Project / Dissertation / Thesis NonPeerReviewed application/pdf http://eprints.utar.edu.my/4940/1/fyp_2022_SC_WKM.pdf Wong, Khye Mun (2022) Improving hand written digit recognition using hybrid feature selection algorithm. Final Year Project, UTAR. http://eprints.utar.edu.my/4940/
title Improving hand written digit recognition using hybrid feature selection algorithm
topic HA Statistics
Q Science (General)
T Technology (General)
url http://eprints.utar.edu.my/4940/
http://eprints.utar.edu.my/4940/1/fyp_2022_SC_WKM.pdf