A method for Arabic handwritten diacritics characters

Bibliographic Details
Main Authors: Abdullah, Muhamad Taufik, Alotaibi, Faiz, Azmi Murad, Masrah Azrifah, O.K. Rahmat, Rahmita Wirza, Abdullah, Rusli
Format: Article
Language: English
Published: Blue Eyes Intelligence Engineering & Sciences Publication 2019
Online Access: http://psasir.upm.edu.my/id/eprint/80428/
http://psasir.upm.edu.my/id/eprint/80428/1/ARABIC.pdf
_version_ 1848858903254663168
author Abdullah, Muhamad Taufik
Alotaibi, Faiz
Azmi Murad, Masrah Azrifah
O.K. Rahmat, Rahmita Wirza
Abdullah, Rusli
building UPM Institutional Repository
collection Online Access
description Optical Character Recognition (OCR) is the process of converting an image representation of a document into an editable format. People recognize characters without difficulty when reading papers or books. However, developing an OCR system that can read and recognize Arabic diacritic characters as well as a human does remains a problem. More specifically, the poor recognition rate of most optical diacritic character recognition systems is mainly attributed to failure to segment handwritten text correctly. To overcome this problem, we develop a method based on seven operations: it starts by finding the text-line height, followed by reading words from the line, and then identifying the diacritic regions. Segmentation is also applied during this operation by converting the text-line into grayscale and binary images. Moreover, we introduce a new model based on the k-nearest neighbors (KNN) algorithm to identify diacritics and segment characters. The KNN model is trained to predict the diacritic directly from the text-line. Finally, we offer an evaluation and discussion of optical diacritic character recognition.
first_indexed 2025-11-15T12:20:50Z
format Article
id upm-80428
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T12:20:50Z
publishDate 2019
publisher Blue Eyes Intelligence Engineering & Sciences Publication
recordtype eprints
repository_type Digital Repository
spelling upm-80428 2021-01-26T20:37:35Z http://psasir.upm.edu.my/id/eprint/80428/ PeerReviewed text en http://psasir.upm.edu.my/id/eprint/80428/1/ARABIC.pdf Abdullah, Muhamad Taufik and Alotaibi, Faiz and Azmi Murad, Masrah Azrifah and O.K. Rahmat, Rahmita Wirza and Abdullah, Rusli (2019) A method for Arabic handwritten diacritics characters. International Journal of Engineering and Advanced Technology, 8 (6S3). pp. 209-212. ISSN 2249-8958 https://www.ijeat.org/wp-content/uploads/papers/v8i6S3/F10340986S319.pdf doi:10.35940/ijeat.F1034.0986S319
title A method for Arabic handwritten diacritics characters
url http://psasir.upm.edu.my/id/eprint/80428/
http://psasir.upm.edu.my/id/eprint/80428/1/ARABIC.pdf
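
The description in this record names three building blocks: binarizing the text-line image, measuring the text-line height, and classifying diacritics with k-nearest neighbors. The Python sketch below illustrates those ideas in their simplest form. It is not the authors' seven-operation method, which this record does not spell out. The threshold of 128, the value k=3, the two features (vertical position of a mark within the line, and its width-to-height ratio), the function names, and all sample values are assumptions made up for illustration.

```python
from collections import Counter
from math import dist


def to_binary(gray, threshold=128):
    """Binarize a grayscale image given as rows of 0-255 ints.

    Dark pixels (below the assumed threshold) are treated as ink and become 1.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def text_line_bounds(binary):
    """Return (top, bottom) indices of the first and last rows containing ink.

    This is a horizontal projection reduced to "does the row have any ink".
    Returns None for a blank image.
    """
    ink_rows = [i for i, row in enumerate(binary) if any(row)]
    if not ink_rows:
        return None
    return ink_rows[0], ink_rows[-1]


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to `query`.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 4x4 grayscale "text line": ink occupies rows 1 and 2.
gray = [
    [255, 255, 255, 255],
    [255,  40, 255, 255],
    [255,  30,  20, 255],
    [255, 255, 255, 255],
]
binary = to_binary(gray)
top, bottom = text_line_bounds(binary)
line_height = bottom - top + 1

# Invented training features: (relative vertical position, width/height ratio).
# Position 0.0 is the top of the line and 1.0 the bottom; fatha sits above
# the letter and kasra below it.
train = [
    ((0.10, 3.0), "fatha"),
    ((0.15, 2.5), "fatha"),
    ((0.20, 2.8), "fatha"),
    ((0.90, 3.1), "kasra"),
    ((0.85, 2.7), "kasra"),
    ((0.95, 2.9), "kasra"),
]
mark_above = knn_predict(train, (0.12, 2.9))
mark_below = knn_predict(train, (0.88, 3.0))
```

In practice the features would be measured from segmented diacritic regions rather than typed in by hand, and the two feature scales would usually be normalized so that neither dominates the distance.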